braintrust
An isomorphic JS library for working with Braintrust. This library contains functionality for running evaluations, logging completions, loading and invoking functions, and more.
braintrust is distributed as a library on NPM.
It is also open source and available on GitHub.
Quickstart
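Install the library with npm (npm install braintrust) or yarn (yarn add braintrust).
Then, create a file like hello.eval.ts that calls Eval with a project name and an evaluator defining data, task, and scores.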
Finally, run the script with npx braintrust eval hello.eval.ts.
Classes
- Attachment
- BraintrustState
- BraintrustStream
- CodeFunction
- CodePrompt
- Dataset
- Experiment
- FailedHTTPResponse
- LazyValue
- Logger
- NoopSpan
- Project
- Prompt
- PromptBuilder
- ReadonlyAttachment
- ReadonlyExperiment
- ScorerBuilder
- SpanImpl
- ToolBuilder
Interfaces
- AttachmentParams
- BackgroundLoggerOpts
- DataSummary
- DatasetSummary
- Evaluator
- ExperimentSummary
- Exportable
- InvokeFunctionArgs
- LogOptions
- LoginOptions
- MetricSummary
- ObjectMetadata
- ParentExperimentIds
- ParentProjectLogIds
- ReporterBody
- ScoreSummary
- Span
Namespaces
Functions
BaseExperiment
▸ BaseExperiment<Input, Expected, Metadata>(options?): BaseExperiment<Input, Expected, Metadata>
Use this to specify that the dataset should actually be the data from a previous (base) experiment. If you do not specify a name, Braintrust will automatically figure out the best base experiment to use based on your git history (or fall back to timestamps).
Type parameters
| Name | Type |
|---|---|
Input | unknown |
Expected | unknown |
Metadata | extends BaseMetadata = void |
Parameters
| Name | Type | Description |
|---|---|---|
options | Object | |
options.name? | string | The name of the base experiment to use. If unspecified, Braintrust will automatically figure out the best base using your git history (or fall back to timestamps). |
Returns
BaseExperiment<Input, Expected, Metadata>
Eval
▸ Eval<Input, Output, Expected, Metadata, EvalReport>(name, evaluator, reporterOrOpts?): Promise<EvalResultWithSummary<Input, Output, Expected, Metadata>>
Type parameters
| Name | Type |
|---|---|
Input | Input |
Output | Output |
Expected | void |
Metadata | extends BaseMetadata = void |
EvalReport | boolean |
Parameters
| Name | Type |
|---|---|
name | string |
evaluator | Evaluator<Input, Output, Expected, Metadata> |
reporterOrOpts? | string \| ReporterDef<EvalReport> \| EvalOptions<EvalReport> |
Returns
Promise<EvalResultWithSummary<Input, Output, Expected, Metadata>>
Reporter
▸ Reporter<EvalReport>(name, reporter): ReporterDef<EvalReport>
Type parameters
| Name |
|---|
EvalReport |
Parameters
| Name | Type |
|---|---|
name | string |
reporter | ReporterBody<EvalReport> |
Returns
ReporterDef<EvalReport>
buildLocalSummary
▸ buildLocalSummary(evaluator, results): ExperimentSummary
Parameters
| Name | Type |
|---|---|
evaluator | EvaluatorDef<any, any, any, any> |
results | EvalResult<any, any, any, any>[] |
Returns
ExperimentSummary
createFinalValuePassThroughStream
▸ createFinalValuePassThroughStream<T>(onFinal, onError): TransformStream<T, BraintrustStreamChunk>
Create a stream that passes through the final value of the stream. This is
used to implement BraintrustStream.finalValue().
Type parameters
| Name | Type |
|---|---|
T | extends string \| Uint8Array \| { data: string ; type: "text_delta" } \| { data: string ; type: "json_delta" } \| { data: string ; type: "error" } \| { data: { message: string ; stream: "stderr" \| "stdout" } = sseConsoleEventDataSchema; type: "console" } \| { data: { data: string ; event: "error" \| "text_delta" \| "json_delta" \| "console" \| "start" \| "done" ; format: "code" \| "global" \| "llm" ; id: string ; name: string ; object_type: "prompt" \| "tool" \| "scorer" \| "task" ; output_type: "completion" \| "score" \| "any" } = sseProgressEventDataSchema; type: "progress" } \| { data: string ; type: "start" } \| { data: string ; type: "done" } |
Parameters
| Name | Type | Description |
|---|---|---|
onFinal | (result: unknown) => void | A function to call with the final value of the stream. |
onError | (error: unknown) => void | - |
Returns
TransformStream<T, BraintrustStreamChunk>
A new stream that passes through the final value of the stream.
currentExperiment
▸ currentExperiment(options?): Experiment | undefined
Returns the currently-active experiment (set by init). Returns undefined if no current experiment has been set.
Parameters
| Name | Type |
|---|---|
options? | OptionalStateArg |
Returns
Experiment | undefined
currentLogger
▸ currentLogger<IsAsyncFlush>(options?): Logger<IsAsyncFlush> | undefined
Returns the currently-active logger (set by initLogger). Returns undefined if no current logger has been set.
Type parameters
| Name | Type |
|---|---|
IsAsyncFlush | extends boolean |
Parameters
| Name | Type |
|---|---|
options? | AsyncFlushArg<IsAsyncFlush> & OptionalStateArg |
Returns
Logger<IsAsyncFlush> | undefined
currentSpan
▸ currentSpan(options?): Span
Return the currently-active span for logging (set by one of the traced methods). If there is no active span, returns a no-op span object, which supports the same interface as spans but does no logging.
See Span for full details.
Parameters
| Name | Type |
|---|---|
options? | OptionalStateArg |
Returns
Span
devNullWritableStream
▸ devNullWritableStream(): WritableStream
Returns
WritableStream
flush
▸ flush(options?): Promise<void>
Flush any pending rows to the server.
Parameters
| Name | Type |
|---|---|
options? | OptionalStateArg |
Returns
Promise<void>
getSpanParentObject
▸ getSpanParentObject<IsAsyncFlush>(options?): Span | Experiment | Logger<IsAsyncFlush>
Mainly for internal use. Return the parent object for starting a span in a global context.
Type parameters
| Name | Type |
|---|---|
IsAsyncFlush | extends boolean |
Parameters
| Name | Type |
|---|---|
options? | AsyncFlushArg<IsAsyncFlush> & OptionalStateArg |
Returns
Span | Experiment | Logger<IsAsyncFlush>
init
▸ init<IsOpen>(options): InitializedExperiment<IsOpen>
Log in, and then initialize a new experiment in a specified project. If the project does not exist, it will be created.
Type parameters
| Name | Type |
|---|---|
IsOpen | extends boolean = false |
Parameters
| Name | Type | Description |
|---|---|---|
options | Readonly<FullInitOptions<IsOpen>> | Options for configuring init(). |
Returns
InitializedExperiment<IsOpen>
The newly created Experiment.
▸ init<IsOpen>(project, options?): InitializedExperiment<IsOpen>
Legacy form of init which accepts the project name as the first parameter,
separately from the remaining options. See init(options) for full details.
Type parameters
| Name | Type |
|---|---|
IsOpen | extends boolean = false |
Parameters
| Name | Type |
|---|---|
project | string |
options? | Readonly<InitOptions<IsOpen>> |
Returns
InitializedExperiment<IsOpen>
initDataset
▸ initDataset<IsLegacyDataset>(options): Dataset<IsLegacyDataset>
Create a new dataset in a specified project. If the project does not exist, it will be created.
Type parameters
| Name | Type |
|---|---|
IsLegacyDataset | extends boolean = false |
Parameters
| Name | Type | Description |
|---|---|---|
options | Readonly<FullInitDatasetOptions<IsLegacyDataset>> | Options for configuring initDataset(). |
Returns
Dataset<IsLegacyDataset>
The newly created Dataset.
▸ initDataset<IsLegacyDataset>(project, options?): Dataset<IsLegacyDataset>
Legacy form of initDataset which accepts the project name as the first
parameter, separately from the remaining options.
See initDataset(options) for full details.
Type parameters
| Name | Type |
|---|---|
IsLegacyDataset | extends boolean = false |
Parameters
| Name | Type |
|---|---|
project | string |
options? | Readonly<InitDatasetOptions<IsLegacyDataset>> |
Returns
Dataset<IsLegacyDataset>
initExperiment
▸ initExperiment<IsOpen>(options): InitializedExperiment<IsOpen>
Alias for init(options).
Type parameters
| Name | Type |
|---|---|
IsOpen | extends boolean = false |
Parameters
| Name | Type |
|---|---|
options | Readonly<InitOptions<IsOpen>> |
Returns
InitializedExperiment<IsOpen>
▸ initExperiment<IsOpen>(project, options?): InitializedExperiment<IsOpen>
Alias for init(project, options).
Type parameters
| Name | Type |
|---|---|
IsOpen | extends boolean = false |
Parameters
| Name | Type |
|---|---|
project | string |
options? | Readonly<InitOptions<IsOpen>> |
Returns
InitializedExperiment<IsOpen>
initLogger
▸ initLogger<IsAsyncFlush>(options?): Logger<IsAsyncFlush>
Create a new logger in a specified project. If the project does not exist, it will be created.
Type parameters
| Name | Type |
|---|---|
IsAsyncFlush | extends boolean = true |
Parameters
| Name | Type | Description |
|---|---|---|
options | Readonly<InitLoggerOptions<IsAsyncFlush>> | Additional options for configuring initLogger(). |
Returns
Logger<IsAsyncFlush>
The newly created Logger.
invoke
▸ invoke<Input, Output, Stream>(args): Promise<InvokeReturn<Stream, Output>>
Invoke a Braintrust function, returning a BraintrustStream or the value as a plain
JavaScript object.
Type parameters
| Name | Type |
|---|---|
Input | Input |
Output | Output |
Stream | extends boolean = false |
Parameters
| Name | Type | Description |
|---|---|---|
args | InvokeFunctionArgs<Input, Output, Stream> & LoginOptions & { forceLogin?: boolean } | The arguments for the function (see InvokeFunctionArgs for more details). |
Returns
Promise<InvokeReturn<Stream, Output>>
The output of the function.
loadPrompt
▸ loadPrompt(options): Promise<Prompt<true, true>>
Load a prompt from the specified project.
Parameters
| Name | Type | Description |
|---|---|---|
options | LoadPromptOptions | Options for configuring loadPrompt(). |
Returns
Promise<Prompt<true, true>>
The prompt object.
Throws
If the prompt is not found.
Throws
If multiple prompts are found with the same slug in the same project (this should never happen).
log
▸ log(event): string
Log a single event to the current experiment. The event will be batched and uploaded behind the scenes.
Parameters
| Name | Type | Description |
|---|---|---|
event | ExperimentLogFullArgs | The event to log. See Experiment.log for full details. |
Returns
string
The id of the logged event.
logError
▸ logError(span, error): void
Parameters
| Name | Type |
|---|---|
span | Span |
error | unknown |
Returns
void
login
▸ login(options?): Promise<BraintrustState>
Log in to Braintrust. This will prompt you for your API token, which you can find at
https://www.braintrust.dev/app/token. This method is called automatically by init().
Parameters
| Name | Type | Description |
|---|---|---|
options | LoginOptions & { forceLogin?: boolean } | Options for configuring login(). |
Returns
Promise<BraintrustState>
loginToState
▸ loginToState(options?): Promise<BraintrustState>
Parameters
| Name | Type |
|---|---|
options | LoginOptions |
Returns
Promise<BraintrustState>
newId
▸ newId(): string
Returns
string
parseCachedHeader
▸ parseCachedHeader(value): number | undefined
Parameters
| Name | Type |
|---|---|
value | undefined \| null \| string |
Returns
number | undefined
permalink
▸ permalink(slug, opts?): Promise<string>
Format a permalink to the Braintrust application for viewing the span
represented by the provided slug.
Links can be generated at any time, but they will only become viewable after the span and its root have been flushed to the server and ingested.
If you have a Span object, use Span.permalink instead.
Parameters
| Name | Type | Description |
|---|---|---|
slug | string | The identifier generated from Span.export. |
opts? | Object | Optional arguments. |
opts.appUrl? | string | The app URL to use. If not provided, the app URL will be inferred from the state. |
opts.orgName? | string | The org name to use. If not provided, the org name will be inferred from the state. |
opts.state? | BraintrustState | The login state to use. If not provided, the global state will be used. |
Returns
Promise<string>
A permalink to the exported span.
renderMessage
▸ renderMessage<T>(render, message): T
Type parameters
| Name | Type |
|---|---|
T | extends { content: string ; name?: string ; role: "system" } \| { content: {} ; name?: string ; role: "user" } \| { content?: null \| string ; function_call?: { arguments: string ; name: string } ; name?: string ; role: "assistant" ; tool_calls?: { function: { arguments: string ; name: string } ; id: string ; type: "function" }[] } \| { content: string ; role: "tool" ; tool_call_id: string } \| { content: string ; name: string ; role: "function" } \| { content?: null \| string ; role: "model" } |
Parameters
| Name | Type |
|---|---|
render | (template: string) => string |
message | T |
Returns
T
reportFailures
▸ reportFailures<Input, Output, Expected, Metadata>(evaluator, failingResults, «destructured»): void
Type parameters
| Name | Type |
|---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata |
Parameters
| Name | Type |
|---|---|
evaluator | EvaluatorDef<Input, Output, Expected, Metadata> |
failingResults | EvalResult<Input, Output, Expected, Metadata>[] |
«destructured» | ReporterOpts |
Returns
void
setFetch
▸ setFetch(fetch): void
Set the fetch implementation to use for requests. You can specify it here,
or when you call login.
Parameters
| Name | Type | Description |
|---|---|---|
fetch | { (input: URL \| RequestInfo, init?: RequestInit): Promise<Response> ; (input: string \| URL \| Request, init?: RequestInit): Promise<Response> } | The fetch implementation to use (see the MDN reference for fetch). |
Returns
void
spanComponentsToObjectId
▸ spanComponentsToObjectId(«destructured»): Promise<string>
Parameters
| Name | Type |
|---|---|
«destructured» | Object |
› components | SpanComponentsV3 |
› state? | BraintrustState |
Returns
Promise<string>
startSpan
▸ startSpan<IsAsyncFlush>(args?): Span
Lower-level alternative to traced. This allows you to start a span yourself, and can be useful in situations
where you cannot use callbacks. However, spans started with startSpan will not be marked as the "current span",
so currentSpan() and traced() will be no-ops. If you want to mark a span as current, use traced instead.
See traced for full details.
Type parameters
| Name | Type |
|---|---|
IsAsyncFlush | extends boolean = false |
Parameters
| Name | Type |
|---|---|
args? | StartSpanArgs & AsyncFlushArg<IsAsyncFlush> & OptionalStateArg |
Returns
Span
summarize
▸ summarize(options?): Promise<ExperimentSummary>
Summarize the current experiment, including the scores (compared to the closest reference experiment) and metadata.
Parameters
| Name | Type | Description |
|---|---|---|
options | Object | Options for summarizing the experiment. |
options.comparisonExperimentId? | string | The experiment to compare against. If not provided, the most recent experiment on the origin's main branch will be used. |
options.summarizeScores? | boolean | Whether to summarize the scores. If false, only the metadata will be returned. |
Returns
Promise<ExperimentSummary>
A summary of the experiment, including the scores (compared to the closest reference experiment) and metadata.
traceable
▸ traceable<F, IsAsyncFlush>(fn, args?): IsAsyncFlush extends false ? (...args: Parameters<F>) => Promise<Awaited<ReturnType<F>>> : F
A synonym for wrapTraced. If you're porting from systems that use traceable, you can use this to
make your codebase more consistent.
Type parameters
| Name | Type |
|---|---|
F | extends (...args: any[]) => any |
IsAsyncFlush | extends boolean = true |
Parameters
| Name | Type |
|---|---|
fn | F |
args? | StartSpanArgs & SetCurrentArg & AsyncFlushArg<IsAsyncFlush> |
Returns
IsAsyncFlush extends false ? (...args: Parameters<F>) => Promise<Awaited<ReturnType<F>>> : F
traced
▸ traced<IsAsyncFlush, R>(callback, args?): PromiseUnless<IsAsyncFlush, R>
Top-level function for starting a span. It checks the following (in precedence order):
- Currently-active span
- Currently-active experiment
- Currently-active logger
and creates a span under the first one that is active. Alternatively, if parent is specified, it creates a span under the specified parent row. If none of these are active, it returns a no-op span object.
See Span.traced for full details.
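The lookup order described above can be modeled as a small pure function. This is only a sketch with hypothetical names (ActiveState, ResolvedParent, resolveParent), not the library's implementation, and it assumes that an explicitly specified parent takes priority over whatever is currently active.

```typescript
// Hypothetical stand-ins that model the lookup order only.
// These are NOT the library's classes.
interface ActiveState {
  span?: string; // name of the currently-active span, if any
  experiment?: string; // name of the currently-active experiment, if any
  logger?: string; // name of the currently-active logger, if any
}

interface ResolvedParent {
  kind: "explicit" | "span" | "experiment" | "logger" | "noop";
  name?: string;
}

// Pick the parent for a new span.
// Assumption: an explicit parent wins over the ambient state.
function resolveParent(
  state: ActiveState,
  explicitParent?: string,
): ResolvedParent {
  if (explicitParent !== undefined) {
    return { kind: "explicit", name: explicitParent };
  }
  if (state.span !== undefined) {
    return { kind: "span", name: state.span };
  }
  if (state.experiment !== undefined) {
    return { kind: "experiment", name: state.experiment };
  }
  if (state.logger !== undefined) {
    return { kind: "logger", name: state.logger };
  }
  // Nothing is active: the caller would receive a no-op span.
  return { kind: "noop" };
}
```

In this model, a state with both an experiment and a logger resolves to the experiment, and an empty state resolves to the no-op case.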
Type parameters
| Name | Type |
|---|---|
IsAsyncFlush | extends boolean = true |
R | void |
Parameters
| Name | Type |
|---|---|
callback | (span: Span) => R |
args? | StartSpanArgs & SetCurrentArg & AsyncFlushArg<IsAsyncFlush> & OptionalStateArg |
Returns
PromiseUnless<IsAsyncFlush, R>
updateSpan
▸ updateSpan(«destructured»): void
Update a span using the output of span.export(). Only resume updating a span once the
original span has been fully written and flushed; otherwise, the updates may conflict with
the original span.
Parameters
| Name | Type |
|---|---|
«destructured» | { exported: string } & Omit<Partial<ExperimentEvent>, "id"> & OptionalStateArg |
Returns
void
withCurrent
▸ withCurrent<R>(span, callback, state?): R
Runs the provided callback with the span as the current span.
Type parameters
| Name |
|---|
R |
Parameters
| Name | Type | Default value |
|---|---|---|
span | Span | undefined |
callback | (span: Span) => R | undefined |
state | BraintrustState | _globalState |
Returns
R
withDataset
▸ withDataset<R, IsLegacyDataset>(project, callback, options?): R
Type parameters
| Name | Type |
|---|---|
R | R |
IsLegacyDataset | extends boolean = false |
Parameters
| Name | Type |
|---|---|
project | string |
callback | (dataset: Dataset<IsLegacyDataset>) => R |
options | Readonly<InitDatasetOptions<IsLegacyDataset>> |
Returns
R
Deprecated
Use initDataset instead.
withExperiment
▸ withExperiment<R>(project, callback, options?): R
Type parameters
| Name |
|---|
R |
Parameters
| Name | Type |
|---|---|
project | string |
callback | (experiment: Experiment) => R |
options | Readonly<LoginOptions & { forceLogin?: boolean } & { baseExperiment?: string ; baseExperimentId?: string ; dataset?: AnyDataset ; description?: string ; experiment?: string ; gitMetadataSettings?: { collect: "some" \| "none" \| "all" ; fields?: ("dirty" \| "tag" \| "commit" \| "branch" \| "author_name" \| "author_email" \| "commit_message" \| "commit_time" \| "git_diff")[] } ; isPublic?: boolean ; metadata?: Record<string, unknown> ; projectId?: string ; repoInfo?: { author_email?: null \| string ; author_name?: null \| string ; branch?: null \| string ; commit?: null \| string ; commit_message?: null \| string ; commit_time?: null \| string ; dirty?: null \| boolean ; git_diff?: null \| string ; tag?: null \| string } ; setCurrent?: boolean ; state?: BraintrustState ; update?: boolean } & InitOpenOption<false> & SetCurrentArg> |
Returns
R
Deprecated
Use init instead.
withLogger
▸ withLogger<IsAsyncFlush, R>(callback, options?): R
Type parameters
| Name | Type |
|---|---|
IsAsyncFlush | extends boolean = false |
R | void |
Parameters
| Name | Type |
|---|---|
callback | (logger: Logger<IsAsyncFlush>) => R |
options | Readonly<LoginOptions & { forceLogin?: boolean } & { projectId?: string ; projectName?: string ; setCurrent?: boolean ; state?: BraintrustState } & AsyncFlushArg<IsAsyncFlush> & SetCurrentArg> |
Returns
R
Deprecated
Use initLogger instead.
wrapAISDKModel
▸ wrapAISDKModel<T>(model): T
Wrap an ai-sdk model (created with .chat(), .completion(), etc.) to add tracing. If Braintrust is
not configured, this is a no-op.
Type parameters
| Name | Type |
|---|---|
T | extends object |
Parameters
| Name | Type |
|---|---|
model | T |
Returns
T
The wrapped object.
wrapOpenAI
▸ wrapOpenAI<T>(openai): T
Wrap an OpenAI object (created with new OpenAI(...)) to add tracing. If Braintrust is
not configured, this is a no-op.
Currently, this only supports the v4 API.
Type parameters
| Name | Type |
|---|---|
T | extends object |
Parameters
| Name | Type |
|---|---|
openai | T |
Returns
T
The wrapped OpenAI object.
wrapOpenAIv4
▸ wrapOpenAIv4<T>(openai): T
Type parameters
| Name | Type |
|---|---|
T | extends OpenAILike |
Parameters
| Name | Type |
|---|---|
openai | T |
Returns
T
wrapTraced
▸ wrapTraced<F, IsAsyncFlush>(fn, args?): IsAsyncFlush extends false ? (...args: Parameters<F>) => Promise<Awaited<ReturnType<F>>> : F
Wrap a function with traced, using the arguments as input and return value as output.
Any functions wrapped this way will automatically be traced, similar to the @traced decorator
in Python. If you want to correctly propagate the function's name and define it in one go, then
you can pass a named function expression, for example: const myFunc = wrapTraced(async function myFunc(input) { ... });
Now, any calls to myFunc will be traced, and the input and output will be logged automatically.
If tracing is inactive, i.e. there is no active logger or experiment, it's just a no-op.
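To illustrate why defining the function "in one go" relies on a named function expression, the sketch below uses a hypothetical stand-in wrapper (wrapSketch, which is not Braintrust's implementation and does no real tracing). It records the wrapped function's name, arguments, and return value, and shows that an anonymous function passed inline gives the wrapper no name to record.

```typescript
// Hypothetical stand-in for a tracing wrapper; NOT Braintrust's implementation.
interface CallRecord {
  name: string;
  input: unknown[];
  output: unknown;
}

// In-memory log used only by this sketch.
const records: CallRecord[] = [];

function wrapSketch<F extends (...args: any[]) => any>(fn: F): F {
  const wrapped = (...args: Parameters<F>): ReturnType<F> => {
    const output = fn(...args);
    // fn.name is the empty string for an anonymous function passed inline.
    records.push({ name: fn.name || "anonymous", input: args, output });
    return output;
  };
  return wrapped as unknown as F;
}

// Named function expression: the name travels with the function value.
const add = wrapSketch(function add(a: number, b: number) {
  return a + b;
});

// Anonymous arrow function passed inline: no name reaches the wrapper.
const sub = wrapSketch((a: number, b: number) => a - b);

add(1, 2);
sub(5, 3);
```

After these two calls, the first record carries the name taken from the function expression, while the second falls back to the placeholder.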
Type parameters
| Name | Type |
|---|---|
F | extends (...args: any[]) => any |
IsAsyncFlush | extends boolean = true |
Parameters
| Name | Type | Description |
|---|---|---|
fn | F | The function to wrap. |
args? | StartSpanArgs & SetCurrentArg & AsyncFlushArg<IsAsyncFlush> | Span-level arguments (e.g. a custom name or type) to pass to traced. |
Returns
IsAsyncFlush extends false ? (...args: Parameters<F>) => Promise<Awaited<ReturnType<F>>> : F
The wrapped function.
Type Aliases
AnyDataset
Ƭ AnyDataset: Dataset<boolean>
BaseExperiment
Ƭ BaseExperiment<Input, Expected, Metadata>: Object
Type parameters
| Name | Type |
|---|---|
Input | Input |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
Type declaration
| Name | Type |
|---|---|
_phantom? | [Input, Expected, Metadata] |
_type | "BaseExperiment" |
name? | string |
BaseMetadata
Ƭ BaseMetadata: Record<string, unknown> | void
BraintrustStreamChunk
Ƭ BraintrustStreamChunk: z.infer<typeof braintrustStreamChunkSchema>
A chunk of data from a Braintrust stream. Each chunk type matches an SSE event type.
ChatPrompt
Ƭ ChatPrompt: Object
Type declaration
| Name | Type |
|---|---|
messages | OpenAIMessage[] |
tools? | any[] |
CodeOpts
Ƭ CodeOpts<Params, Returns, Fn>: Partial<BaseFnOpts> & { handler: Fn } & Schema<Params, Returns>
Type parameters
| Name | Type |
|---|---|
Params | Params |
Returns | Returns |
Fn | extends GenericFunction<Params, Returns> |
CommentEvent
Ƭ CommentEvent: IdField & { _audit_metadata?: Record<string, unknown> ; _audit_source: Source ; comment: { text: string } ; created: string ; origin: { id: string } } & ParentExperimentIds | ParentProjectLogIds
CompiledPrompt
Ƭ CompiledPrompt<Flavor>: CompiledPromptParams & { span_info?: { metadata: { prompt: { id: string ; project_id: string ; variables: Record<string, unknown> ; version: string } } ; name?: string ; spanAttributes?: Record<any, any> } } & (Flavor extends "chat" ? ChatPrompt : Flavor extends "completion" ? CompletionPrompt : {})
Type parameters
| Name | Type |
|---|---|
Flavor | extends "chat" \| "completion" |
CompiledPromptParams
Ƭ CompiledPromptParams: Omit<NonNullable<PromptData["options"]>["params"], "use_cache"> & { model: NonNullable<NonNullable<PromptData["options"]>["model"]> }
CompletionPrompt
Ƭ CompletionPrompt: Object
Type declaration
| Name | Type |
|---|---|
prompt | string |
CreateProjectOpts
Ƭ CreateProjectOpts: NameOrId
DatasetRecord
Ƭ DatasetRecord<IsLegacyDataset>: IsLegacyDataset extends true ? LegacyDatasetRecord : NewDatasetRecord
Type parameters
| Name | Type |
|---|---|
IsLegacyDataset | extends boolean = typeof DEFAULT_IS_LEGACY_DATASET |
DefaultMetadataType
Ƭ DefaultMetadataType: void
DefaultPromptArgs
Ƭ DefaultPromptArgs: Partial<CompiledPromptParams & AnyModelParam & ChatPrompt & CompletionPrompt>
EndSpanArgs
Ƭ EndSpanArgs: Object
Type declaration
| Name | Type |
|---|---|
endTime? | number |
EvalCase
Ƭ EvalCase<Input, Expected, Metadata>: { _xact_id?: TransactionId ; id?: string ; input: Input ; tags?: string[] } & (Expected extends void ? object : { expected: Expected }) & (Metadata extends void ? object : { metadata: Metadata })
Type parameters
| Name |
|---|
Input |
Expected |
Metadata |
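The conditional parts of EvalCase decide whether expected and metadata are part of the case at all. The sketch below uses a local copy of the alias for illustration only; TransactionId is assumed to be a string here, and the sample values are made up.

```typescript
// Local copy of the alias, for illustration only.
// Assumption: TransactionId is a string.
type TransactionId = string;

type EvalCase<Input, Expected, Metadata> = {
  input: Input;
  tags?: string[];
  id?: string;
  _xact_id?: TransactionId;
} & (Expected extends void ? object : { expected: Expected }) &
  (Metadata extends void ? object : { metadata: Metadata });

// Expected = string, Metadata = void: `expected` is required.
const withExpected: EvalCase<string, string, void> = {
  input: "Foo",
  expected: "Hi Foo",
};

// Both void: only `input` is required (tags, id, _xact_id stay optional).
const inputOnly: EvalCase<string, void, void> = {
  input: "Bar",
  tags: ["smoke"],
};

// Both supplied: `expected` and `metadata` are required.
const full: EvalCase<string, string, { source: string }> = {
  input: "Baz",
  expected: "Hi Baz",
  metadata: { source: "manual" },
};
```

Omitting expected from withExpected, or metadata from full, would be a compile-time error under this definition.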
EvalResult
Ƭ EvalResult<Input, Output, Expected, Metadata>: EvalCase<Input, Expected, Metadata> & { error: unknown ; output: Output ; scores: Record<string, number | null> }
Type parameters
| Name | Type |
|---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
EvalScorer
Ƭ EvalScorer<Input, Output, Expected, Metadata>: (args: EvalScorerArgs<Input, Output, Expected, Metadata>) => OneOrMoreScores | Promise<OneOrMoreScores>
Type parameters
| Name | Type |
|---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
Type declaration
▸ (args): OneOrMoreScores | Promise<OneOrMoreScores>
Parameters
| Name | Type |
|---|---|
args | EvalScorerArgs<Input, Output, Expected, Metadata> |
Returns
OneOrMoreScores | Promise<OneOrMoreScores>
EvalScorerArgs
Ƭ EvalScorerArgs<Input, Output, Expected, Metadata>: EvalCase<Input, Expected, Metadata> & { output: Output }
Type parameters
| Name | Type |
|---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
EvalTask
Ƭ EvalTask<Input, Output>: ((input: Input, hooks: EvalHooks) => Promise<Output>) | ((input: Input, hooks: EvalHooks) => Output)
Type parameters
| Name |
|---|
Input |
Output |
EvaluatorDef
Ƭ EvaluatorDef<Input, Output, Expected, Metadata>: { evalName: string ; projectName: string } & Evaluator<Input, Output, Expected, Metadata>
Type parameters
| Name | Type |
|---|---|
Input | Input |
Output | Output |
Expected | Expected |
Metadata | extends BaseMetadata = DefaultMetadataType |
EvaluatorFile
Ƭ EvaluatorFile: Object
Type declaration
| Name | Type |
|---|---|
evaluators | { [evalName: string]: { evaluator: EvaluatorDef<unknown, unknown, unknown, BaseMetadata> ; reporter?: ReporterDef<unknown> \| string }; } |
functions | CodeFunction<unknown, unknown, GenericFunction<unknown, unknown>>[] |
prompts | CodePrompt[] |
reporters | { [reporterName: string]: ReporterDef<unknown>; } |
ExperimentLogFullArgs
Ƭ ExperimentLogFullArgs: Partial<Omit<OtherExperimentLogFields, "output" | "scores">> & Required<Pick<OtherExperimentLogFields, "output" | "scores">> & Partial<InputField> & Partial<IdField>
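This alias makes output and scores mandatory while leaving every other field optional. A minimal sketch of the same composition, assuming a reduced local stand-in (OtherFields) for OtherExperimentLogFields with only a few of its fields, and local copies of InputField and IdField:

```typescript
// Simplified local stand-ins; the real OtherExperimentLogFields has more fields.
type OtherFields = {
  output: unknown;
  scores: Record<string, number | null>;
  expected: unknown;
  metadata: Record<string, unknown>;
  tags: string[];
};
type InputField = { input: unknown };
type IdField = { id: string };

// Same composition as the alias: output and scores required, the rest optional.
type LogFullArgs = Partial<Omit<OtherFields, "output" | "scores">> &
  Required<Pick<OtherFields, "output" | "scores">> &
  Partial<InputField> &
  Partial<IdField>;

// Minimal valid event: only the two required fields.
const minimal: LogFullArgs = {
  output: "Hi Foo",
  scores: { exact_match: 1 },
};

// Fuller event: optional fields may be added freely.
const detailed: LogFullArgs = {
  input: "Foo",
  output: "Hi Foo",
  expected: "Hi Foo",
  scores: { exact_match: 1, fluency: null },
  tags: ["smoke"],
};
```

Leaving out scores in either object would fail to type-check, which is the difference from ExperimentLogPartialArgs, where every field is optional.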
ExperimentLogPartialArgs
Ƭ ExperimentLogPartialArgs: Partial<OtherExperimentLogFields> & Partial<InputField>
FullInitOptions
Ƭ FullInitOptions<IsOpen>: { project?: string } & InitOptions<IsOpen>
Type parameters
| Name | Type |
|---|---|
IsOpen | extends boolean |
FullLoginOptions
Ƭ FullLoginOptions: LoginOptions & { forceLogin?: boolean }
IdField
Ƭ IdField: Object
Type declaration
| Name | Type |
|---|---|
id | string |
InitOptions
Ƭ InitOptions<IsOpen>: FullLoginOptions & { baseExperiment?: string ; baseExperimentId?: string ; dataset?: AnyDataset ; description?: string ; experiment?: string ; gitMetadataSettings?: GitMetadataSettings ; isPublic?: boolean ; metadata?: Record<string, unknown> ; projectId?: string ; repoInfo?: RepoInfo ; setCurrent?: boolean ; state?: BraintrustState ; update?: boolean } & InitOpenOption<IsOpen>
Type parameters
| Name | Type |
|---|---|
IsOpen | extends boolean |
InputField
Ƭ InputField: Object
Type declaration
| Name | Type |
|---|---|
input | unknown |
InvokeReturn
Ƭ InvokeReturn<Stream, Output>: Stream extends true ? BraintrustStream : Output
The return type of the invoke function. Conditionally returns a BraintrustStream
if stream is true, otherwise returns the output of the function using the Zod schema's
type if present.
Type parameters
| Name | Type |
|---|---|
Stream | extends boolean |
Output | Output |
LogCommentFullArgs
Ƭ LogCommentFullArgs: IdField & { _audit_metadata?: Record<string, unknown> ; _audit_source: Source ; comment: { text: string } ; created: string ; origin: { id: string } } & ParentExperimentIds | ParentProjectLogIds
LogFeedbackFullArgs
Ƭ LogFeedbackFullArgs: IdField & Partial<Omit<OtherExperimentLogFields, "output" | "metrics" | "datasetRecordId"> & { comment: string ; source: Source }>
OtherExperimentLogFields
Ƭ OtherExperimentLogFields: Object
Type declaration
| Name | Type |
|---|---|
_async_scoring_control | AsyncScoringControl |
_merge_paths | string[][] |
_skip_async_scoring | boolean |
datasetRecordId | string |
error | unknown |
expected | unknown |
metadata | Record<string, unknown> |
metrics | Record<string, unknown> |
origin | z.infer<typeof objectReferenceSchema> |
output | unknown |
scores | Record<string, number \| null> |
tags | string[] |
PromiseUnless
Ƭ PromiseUnless<B, R>: B extends true ? R : Promise<Awaited<R>>
Type parameters
| Name |
|---|
B |
R |
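A short sketch of how this conditional type resolves, using a local copy of the alias for illustration only (the variable names are made up):

```typescript
// Local copy of the alias, for illustration only.
type PromiseUnless<B, R> = B extends true ? R : Promise<Awaited<R>>;

// B = true: the result type is R itself.
const direct: PromiseUnless<true, number> = 42;

// B = false: the result type is Promise<Awaited<R>>.
const wrapped: PromiseUnless<false, number> = Promise.resolve(42);

// Awaited<R> flattens nested promises, so this is Promise<string>,
// not Promise<Promise<string>>.
const flattened: PromiseUnless<false, Promise<string>> = Promise.resolve("ok");
```

This is why the return type of traced is written as PromiseUnless<IsAsyncFlush, R>: the callback's result is returned as-is when the flag type is true and as a promise otherwise.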
PromptOpts
Ƭ PromptOpts<HasId, HasVersion, HasTools, HasNoTrace>: Partial<Omit<BaseFnOpts, "name">> & { name: string } & (HasId extends true ? PromptId : Partial<PromptId>) & (HasVersion extends true ? PromptVersion : Partial<PromptVersion>) & (HasTools extends true ? Partial<PromptTools> : {}) & (HasNoTrace extends true ? Partial<PromptNoTrace> : {}) & PromptContents & { model: string ; params?: ModelParams }
Type parameters
| Name | Type |
|---|---|
HasId | extends boolean |
HasVersion | extends boolean |
HasTools | extends boolean = true |
HasNoTrace | extends boolean = true |
PromptRowWithId
Ƭ PromptRowWithId<HasId, HasVersion>: Omit<PromptRow, "log_id" | "org_id" | "project_id" | "id" | "_xact_id"> & Partial<Pick<PromptRow, "project_id">> & (HasId extends true ? Pick<PromptRow, "id"> : Partial<Pick<PromptRow, "id">>) & (HasVersion extends true ? Pick<PromptRow, "_xact_id"> : Partial<Pick<PromptRow, "_xact_id">>)
Type parameters
| Name | Type |
|---|---|
HasId | extends boolean = true |
HasVersion | extends boolean = true |
ScorerOpts
Ƭ ScorerOpts<Output, Input, Params, Returns, Fn>: CodeOpts<Exact<Params, ScorerArgs<Output, Input>>, Returns, Fn> | ScorerPromptOpts
Type parameters
| Name | Type |
|---|---|
Output | Output |
Input | Input |
Params | Params |
Returns | Returns |
Fn | extends GenericFunction<Exact<Params, ScorerArgs<Output, Input>>, Returns> |
SerializedBraintrustState
Ƭ SerializedBraintrustState: z.infer<typeof loginSchema>
SetCurrentArg
Ƭ SetCurrentArg: Object
Type declaration
| Name | Type |
|---|---|
setCurrent? | boolean |
SpanContext
Ƭ SpanContext: Object
Type declaration
| Name | Type |
|---|---|
NOOP_SPAN | typeof NOOP_SPAN |
currentSpan | typeof currentSpan |
startSpan | typeof startSpan |
withCurrent | typeof withCurrent |
StartSpanArgs
Ƭ StartSpanArgs: Object
Type declaration
| Name | Type |
|---|---|
event? | StartSpanEventArgs |
name? | string |
parent? | string |
propagatedEvent? | StartSpanEventArgs |
spanAttributes? | Record<any, any> |
startTime? | number |
type? | SpanType |
ToolFunctionDefinition
Ƭ ToolFunctionDefinition: z.infer<typeof toolFunctionDefinitionSchema>
WithTransactionId
Ƭ WithTransactionId<R>: R & { _xact_id: TransactionId }
Type parameters
| Name |
|---|
R |
Variables
LEGACY_CACHED_HEADER
• Const LEGACY_CACHED_HEADER: "x-cached"
NOOP_SPAN
• Const NOOP_SPAN: NoopSpan
X_CACHED_HEADER
• Const X_CACHED_HEADER: "x-bt-cached"
braintrustStreamChunkSchema
• Const braintrustStreamChunkSchema: ZodUnion<[ZodObject<{ data: ZodString ; type: ZodLiteral<"text_delta"> }, "strip", ZodTypeAny, { data: string ; type: "text_delta" }, { data: string ; type: "text_delta" }>, ZodObject<{ data: ZodString ; type: ZodLiteral<"json_delta"> }, "strip", ZodTypeAny, { data: string ; type: "json_delta" }, { data: string ; type: "json_delta" }>, ZodObject<{ data: ZodString ; type: ZodLiteral<"error"> }, "strip", ZodTypeAny, { data: string ; type: "error" }, { data: string ; type: "error" }>]>
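The three object shapes in this schema form a discriminated union on type. The sketch below handles such chunks with a local type that mirrors those shapes. The helper collect is hypothetical, and it assumes that text_delta chunks carry raw text fragments and that json_delta chunks carry fragments of a single JSON document.

```typescript
// Local type mirroring the three object shapes in the schema.
type Chunk =
  | { type: "text_delta"; data: string }
  | { type: "json_delta"; data: string }
  | { type: "error"; data: string };

// Hypothetical helper: concatenate deltas into a final value,
// throwing as soon as an error chunk is seen.
function collect(chunks: Chunk[]): { text: string; json: unknown } {
  let text = "";
  let jsonSource = "";
  for (const chunk of chunks) {
    switch (chunk.type) {
      case "text_delta":
        text += chunk.data;
        break;
      case "json_delta":
        jsonSource += chunk.data;
        break;
      case "error":
        throw new Error(chunk.data);
    }
  }
  // Only parse when at least one JSON fragment arrived.
  const json = jsonSource === "" ? undefined : JSON.parse(jsonSource);
  return { text, json };
}
```

Because the switch covers every member of the union, adding a new chunk type to the local Chunk type would surface here as an unhandled case.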
projects
• Const projects: ProjectBuilder
toolFunctionDefinitionSchema
• Const toolFunctionDefinitionSchema: z.ZodObject<{ function: z.ZodObject<{ description: z.ZodOptional<z.ZodString> ; name: z.ZodString ; parameters: z.ZodOptional<z.ZodRecord<z.ZodString, z.ZodUnknown>> ; strict: z.ZodOptional<z.ZodBoolean> }, "strip", z.ZodTypeAny, { description?: string ; name: string ; parameters?: Record<string, unknown> ; strict?: boolean }, { description?: string ; name: string ; parameters?: Record<string, unknown> ; strict?: boolean }> ; type: z.ZodLiteral<"function"> }, "strip", z.ZodTypeAny, { function: { description?: string ; name: string ; parameters?: Record<string, unknown> ; strict?: boolean } ; type: "function" }, { function: { description?: string ; name: string ; parameters?: Record<string, unknown> ; strict?: boolean } ; type: "function" }>