oumi.analyze#
Analyzer framework for dataset analysis.
- class oumi.analyze.AnalysisPipeline(analyzers: list[MessageAnalyzer[Any] | ConversationAnalyzer[Any] | DatasetAnalyzer[Any] | PreferenceAnalyzer[Any]], cache_dir: str | Path | None = None)[source]#
Bases: object
Pipeline for orchestrating multiple analyzers on a dataset.
The AnalysisPipeline manages running multiple analyzers on conversations, handling different analyzer scopes appropriately, and providing unified access to results.
Note
PreferenceAnalyzers are not run by run(). Use run_preference() separately to analyze preference pairs (chosen/rejected conversations).
Example
>>> from oumi.analyze import AnalysisPipeline, LengthAnalyzer
>>>
>>> pipeline = AnalysisPipeline(
...     analyzers=[
...         LengthAnalyzer.from_config({"tokenizer_name": "cl100k_base"})
...     ],
...     cache_dir="./analysis_cache",
... )
>>> results = pipeline.run(conversations)
- Parameters:
analyzers – List of analyzer instances to run.
cache_dir – Optional directory for caching results.
- property conversations: list[Conversation]#
Get the analyzed conversations.
- get_analyzer(name: str) MessageAnalyzer[Any] | ConversationAnalyzer[Any] | DatasetAnalyzer[Any] | PreferenceAnalyzer[Any] | None[source]#
Get an analyzer by name, or None if not found.
- load_cache() bool[source]#
Load results from cache directory.
Note
Loaded results are raw dictionaries, not Pydantic model instances. Use get_cached_result() to reconstruct typed results if needed, or access raw data directly via self.results.
- Returns:
True if cache was loaded successfully, False otherwise.
- property message_to_conversation_idx: list[int]#
Get the mapping from message index to conversation index.
- property results: dict[str, list[BaseModel] | BaseModel]#
Get the cached analysis results.
- run(conversations: list[Conversation]) dict[str, list[BaseModel] | BaseModel][source]#
Run all analyzers on the provided conversations.
Note
PreferenceAnalyzers are not run by this method. Use run_preference() separately to analyze preference pairs.
- Parameters:
conversations – List of conversations to analyze.
- Returns:
Dictionary mapping analyzer names to their results.
- For ConversationAnalyzer: list of results (one per conversation)
- For MessageAnalyzer: list of results (one per message)
- For DatasetAnalyzer: single result for entire dataset
- run_preference(pairs: list[tuple[Conversation, Conversation]]) dict[str, list[BaseModel] | BaseModel][source]#
Run preference analyzers on conversation pairs.
- Parameters:
pairs – List of (chosen, rejected) conversation tuples.
- Returns:
Dictionary mapping analyzer names to their results.
- class oumi.analyze.AnalyzerConfig(id: str, instance_id: str, params: dict[str, Any] = <factory>)[source]#
Bases: object
Configuration for a single analyzer instance.
Each analyzer has a type (id) and a unique instance name (instance_id). Multiple instances of the same type are supported (e.g. two length analyzers with different tokenizers).
- Variables:
id (str) – Analyzer type (registry id, e.g. “length”, “difficulty_judge”).
instance_id (str) – Unique instance name (always required). Used as the results key and in test metric paths.
params (dict[str, Any]) – Analyzer-specific parameters.
- id: str#
- instance_id: str#
- params: dict[str, Any]#
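The multiple-instance pattern can be sketched with a plain dataclass that mirrors the shape of AnalyzerConfig (the class and field names below stand in for the real config; only the `id`/`instance_id`/`params` structure is taken from the documentation above):

```python
from dataclasses import dataclass, field
from typing import Any

# Stand-in mirroring the shape of oumi.analyze.AnalyzerConfig,
# so the multiple-instance pattern can be shown without the library.
@dataclass
class AnalyzerConfigSketch:
    id: str                      # analyzer type (registry id)
    instance_id: str             # unique per-instance name
    params: dict[str, Any] = field(default_factory=dict)

# Two instances of the same analyzer type, distinguished by instance_id.
configs = [
    AnalyzerConfigSketch("length", "length_cl100k",
                         {"tokenizer_name": "cl100k_base"}),
    AnalyzerConfigSketch("length", "length_llama",
                         {"tokenizer_name": "hf-internal-testing/llama-tokenizer"}),
]

# instance_id becomes the results key, so both instances can coexist.
results_keys = [c.instance_id for c in configs]
```

Because `instance_id` keys the results, two length analyzers with different tokenizers never overwrite each other.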
- class oumi.analyze.BaseAnalyzer[source]#
Bases: ABC, Generic[TResult]
Base class for all analyzer types.
Subclasses must implement metadata methods to describe their result schema. The generic type parameter TResult provides type safety for the analyze() method.
All concrete analyzer types (MessageAnalyzer, ConversationAnalyzer, etc.) inherit from this class. Set _result_model in subclasses to get automatic implementations of get_result_schema, get_metric_names, and get_metric_descriptions.
- Variables:
analyzer_id (str | None) – Optional custom identifier for this analyzer instance. If not set, the class name is used as the identifier.
- analyzer_id: str | None = None#
- get_available_metric_names() list[str][source]#
Get metric names this instance will actually produce.
Subclasses can override to exclude metrics that depend on instance config (e.g., rendered_tokens requires a HuggingFace tokenizer).
- abstractmethod classmethod get_config_schema() dict[str, Any][source]#
Get JSON schema for this analyzer’s configuration.
- classmethod get_metric_descriptions() dict[str, str][source]#
Get descriptions for each metric field.
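The `_result_model` hook described above can be sketched as follows. This is a simplified stand-in, not oumi's implementation: it uses a dataclass instead of a Pydantic model, and the class names are invented for illustration.

```python
from dataclasses import dataclass, fields

# Hypothetical result model; field names become metric names.
@dataclass
class LengthResult:
    total_tokens: int
    num_messages: int

# Sketch of the base-class behavior: a declared _result_model
# drives get_metric_names automatically, so subclasses only
# need to point at their result type.
class AnalyzerBase:
    _result_model = None

    @classmethod
    def get_metric_names(cls):
        return [f.name for f in fields(cls._result_model)]

class LengthAnalyzerSketch(AnalyzerBase):
    _result_model = LengthResult

metrics = LengthAnalyzerSketch.get_metric_names()  # ['total_tokens', 'num_messages']
```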
- class oumi.analyze.ConversationAnalyzer[source]#
Bases: BaseAnalyzer[TResult]
Base class for analyzers that operate on complete conversations.
- __call__(conversation: Conversation) TResult[source]#
Call analyze() directly.
- abstractmethod analyze(conversation: Conversation) TResult[source]#
Analyze a complete conversation and return typed results.
- Parameters:
conversation – The conversation to analyze.
- Returns:
Typed result model containing analysis metrics.
- analyze_batch(conversations: list[Conversation]) list[TResult][source]#
Analyze multiple conversations and return results for each.
Override this method to implement batched processing for better performance, especially for analyzers that benefit from batching (e.g., those using ML models).
- Parameters:
conversations – List of conversations to analyze.
- Returns:
List of typed results, one per conversation.
- static get_conversation_text(conversation: Conversation, tokenizer: PreTrainedTokenizerBase) str[source]#
Get the full text of a conversation using a tokenizer’s chat template.
- Parameters:
conversation – The conversation to extract text from.
tokenizer – Tokenizer with a chat template for formatting.
- Returns:
Full conversation text as a single string.
- Raises:
ValueError – If the tokenizer doesn’t have a chat template.
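A minimal conversation-scoped analyzer might look like the sketch below. The `Message`/`Conversation` stand-ins and the analyzer itself are illustrative; they mimic the analyze/analyze_batch contract documented above without importing oumi.

```python
from dataclasses import dataclass

# Minimal stand-ins for oumi's Message/Conversation types.
@dataclass
class Message:
    role: str
    content: str

@dataclass
class Conversation:
    messages: list

@dataclass
class TurnCountResult:
    num_turns: int

# Conversation-scoped analyzer: one typed result per conversation,
# with analyze_batch defaulting to a per-item loop.
class TurnCountAnalyzer:
    def analyze(self, conversation: Conversation) -> TurnCountResult:
        return TurnCountResult(num_turns=len(conversation.messages))

    def analyze_batch(self, conversations: list) -> list:
        return [self.analyze(c) for c in conversations]

conv = Conversation(messages=[Message("user", "Hi"), Message("assistant", "Hello")])
result = TurnCountAnalyzer().analyze(conv)
```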
- class oumi.analyze.DataQualityAnalyzer[source]#
Bases: ConversationAnalyzer[DataQualityMetrics]
Analyzer for basic data quality checks on conversations.
Checks for three common data quality issues without requiring an LLM:
- Non-alternating user/assistant message patterns
- Empty or whitespace-only turns
- Values serialized as strings (NaN, null, None, undefined)
Example
>>> from oumi.analyze.analyzers.quality import DataQualityAnalyzer
>>> from oumi.core.types.conversation import Conversation, Message, Role
>>>
>>> analyzer = DataQualityAnalyzer()
>>> conversation = Conversation(messages=[
...     Message(role=Role.USER, content="Hello"),
...     Message(role=Role.ASSISTANT, content="Hi there!"),
... ])
>>> result = analyzer.analyze(conversation)
>>> print(result.has_non_alternating_turns)
False
- analyze(conversation: Conversation) DataQualityMetrics[source]#
Analyze data quality for a conversation.
- Parameters:
conversation – The conversation to analyze.
- Returns:
DataQualityMetrics with the quality check results.
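The three checks can be sketched as plain functions. This illustrates the logic described above, not DataQualityAnalyzer's actual code; the exact role filtering and pattern set are assumptions.

```python
# Illustrative sketches of the three quality checks.
INVALID_PATTERNS = {"nan", "null", "none", "undefined"}

def has_non_alternating_turns(roles: list) -> bool:
    # True if any two consecutive non-system turns share a role.
    turns = [r for r in roles if r != "system"]
    return any(a == b for a, b in zip(turns, turns[1:]))

def count_empty_turns(contents: list) -> int:
    # Empty or whitespace-only message bodies.
    return sum(1 for c in contents if not c.strip())

def invalid_value_patterns(contents: list) -> list:
    # Messages whose entire content is a serialized missing-value token.
    return sorted({c.strip().lower() for c in contents
                   if c.strip().lower() in INVALID_PATTERNS})
```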
- class oumi.analyze.DataQualityMetrics(*, has_non_alternating_turns: bool, has_empty_turns: bool, empty_turn_count: int, has_invalid_values: bool, invalid_value_patterns: list[str])[source]#
Bases: BaseModel
Result model for data quality checks on a conversation.
Example
>>> result = DataQualityMetrics(
...     has_non_alternating_turns=False,
...     has_empty_turns=False,
...     empty_turn_count=0,
...     has_invalid_values=False,
...     invalid_value_patterns=[],
... )
>>> print(result.has_non_alternating_turns)
False
- empty_turn_count: int#
- has_empty_turns: bool#
- has_invalid_values: bool#
- has_non_alternating_turns: bool#
- invalid_value_patterns: list[str]#
- model_config = {}#
Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.
- class oumi.analyze.DatasetAnalyzer[source]#
Bases: BaseAnalyzer[TResult]
Base class for analyzers that operate on entire datasets.
- __call__(conversations: list[Conversation]) TResult[source]#
Call analyze() directly.
- abstractmethod analyze(conversations: list[Conversation]) TResult[source]#
Analyze an entire dataset and return typed results.
This method receives all conversations at once, enabling cross-sample operations that require global context.
- Parameters:
conversations – All conversations in the dataset.
- Returns:
Typed result model containing dataset-level analysis.
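A cross-sample operation that needs global context is exact-duplicate counting, sketched below with toy stand-ins (the class and field names are invented for illustration):

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class DuplicateStats:
    num_conversations: int
    num_duplicates: int

# Dataset-scoped sketch: duplicate counting needs every sample at once,
# which is exactly what a DatasetAnalyzer receives.
class DuplicateAnalyzerSketch:
    def analyze(self, texts: list) -> DuplicateStats:
        counts = Counter(texts)
        # Each value beyond the first occurrence is a duplicate.
        dupes = sum(n - 1 for n in counts.values() if n > 1)
        return DuplicateStats(num_conversations=len(texts), num_duplicates=dupes)

stats = DuplicateAnalyzerSketch().analyze(["hi", "hi", "bye"])
```

A per-conversation analyzer could not compute this, since no single sample knows about the others.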
- class oumi.analyze.LengthAnalyzer(tokenizer: Tokenizer | None = None)[source]#
Bases: ConversationAnalyzer[LengthMetrics]
Analyzer for computing token length metrics of conversations.
Computes token counts for conversations using a provided tokenizer. Provides both conversation-level totals and per-message breakdowns.
Example
>>> from oumi.analyze.analyzers.length import LengthAnalyzer
>>> from oumi.core.types.conversation import Conversation, Message, Role
>>>
>>> analyzer = LengthAnalyzer.from_config({"tokenizer_name": "cl100k_base"})
>>> conversation = Conversation(messages=[
...     Message(role=Role.USER, content="Hello, how are you?"),
...     Message(role=Role.ASSISTANT, content="I'm doing well, thanks!"),
... ])
>>> result = analyzer.analyze(conversation)
>>> print(f"Total tokens: {result.total_tokens}")
Total tokens: 12
- Parameters:
tokenizer – Tokenizer instance for token counting. Must have an encode(text) -> list method. Use from_config() to construct from a tokenizer name, or pass any compatible tokenizer directly.
- analyze(conversation: Conversation) LengthMetrics[source]#
Analyze token length metrics for a conversation.
- Parameters:
conversation – The conversation to analyze.
- Returns:
LengthMetrics containing token counts.
- analyze_text(text: str) LengthMetrics[source]#
Analyze token length metrics for a single text string.
Convenience method for analyzing text without creating a Conversation.
- Parameters:
text – The text to analyze.
- Returns:
LengthMetrics for the text (treated as a single message).
- classmethod from_config(config: dict[str, Any]) LengthAnalyzer[source]#
Create a LengthAnalyzer from a config dictionary.
- Parameters:
config – See LengthAnalyzerConfig for supported keys.
- Returns:
LengthAnalyzer instance with configured tokenizer.
- class oumi.analyze.LengthAnalyzerConfig(*, tokenizer_name: str = 'cl100k_base', trust_remote_code: bool = False)[source]#
Bases: BaseModel
Configuration for LengthAnalyzer.
- model_config = {}#
Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.
- tokenizer_name: str#
- trust_remote_code: bool#
- class oumi.analyze.LengthMetrics(*, total_tokens: int, rendered_tokens: int | None = None, avg_tokens_per_message: float, message_token_counts: list[int], num_messages: int, user_total_tokens: int = 0, assistant_total_tokens: int = 0, system_total_tokens: int = 0, tool_total_tokens: int = 0)[source]#
Bases: BaseModel
Result model for length analysis of conversations.
Example
>>> result = LengthMetrics(
...     total_tokens=25,
...     avg_tokens_per_message=12.5,
...     message_token_counts=[10, 15],
...     num_messages=2,
... )
>>> print(result.total_tokens)
25
- assistant_total_tokens: int#
- avg_tokens_per_message: float#
- message_token_counts: list[int]#
- model_config = {}#
Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.
- num_messages: int#
- rendered_tokens: int | None#
- system_total_tokens: int#
- tool_total_tokens: int#
- total_tokens: int#
- user_total_tokens: int#
- class oumi.analyze.MessageAnalyzer[source]#
Bases: BaseAnalyzer[TResult]
Base class for analyzers that operate on individual messages.
- abstractmethod analyze(message: Message) TResult[source]#
Analyze a single message and return typed results.
- Parameters:
message – The message to analyze.
- Returns:
Typed result model containing analysis metrics.
- analyze_batch(messages: list[Message]) list[TResult][source]#
Analyze multiple messages and return results for each.
Override this method to implement vectorized/batched processing for better performance with large datasets.
- Parameters:
messages – List of messages to analyze.
- Returns:
List of typed results, one per message.
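The batch-override pattern can be sketched like this; the analyzer below is a toy, and a real override would replace the loop with one batched call (e.g. a single model inference):

```python
# Sketch of the analyze_batch override pattern: the base behavior
# loops per message, while a subclass can process the whole batch
# at once for better throughput.
class WordCountAnalyzer:
    def analyze(self, text: str) -> int:
        return len(text.split())

    def analyze_batch(self, texts: list) -> list:
        # A batched override might issue one vectorized call here
        # instead of iterating.
        return [self.analyze(t) for t in texts]

counts = WordCountAnalyzer().analyze_batch(["hello world", "one two three"])
```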
- class oumi.analyze.PreferenceAnalyzer[source]#
Bases: BaseAnalyzer[TResult]
Base class for analyzers that operate on preference pairs.
- __call__(chosen: Conversation, rejected: Conversation) TResult[source]#
Call analyze() directly.
- abstractmethod analyze(chosen: Conversation, rejected: Conversation) TResult[source]#
Analyze a preference pair and return typed results.
- Parameters:
chosen – The preferred/chosen conversation.
rejected – The rejected/dispreferred conversation.
- Returns:
Typed result model containing preference analysis.
- analyze_batch(pairs: list[tuple[Conversation, Conversation]]) list[TResult][source]#
Analyze multiple preference pairs.
- Parameters:
pairs – List of (chosen, rejected) conversation tuples.
- Returns:
List of typed results, one per pair.
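A minimal preference analyzer might compare chosen and rejected responses, as sketched below with plain strings standing in for conversations (all names here are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class LengthPreferenceResult:
    chosen_chars: int
    rejected_chars: int
    chosen_is_longer: bool

# Preference-scoped sketch: each result describes one
# (chosen, rejected) pair rather than a single conversation.
class LengthPreferenceAnalyzer:
    def analyze(self, chosen: str, rejected: str) -> LengthPreferenceResult:
        return LengthPreferenceResult(
            chosen_chars=len(chosen),
            rejected_chars=len(rejected),
            chosen_is_longer=len(chosen) > len(rejected),
        )

    def analyze_batch(self, pairs: list) -> list:
        return [self.analyze(c, r) for c, r in pairs]

results = LengthPreferenceAnalyzer().analyze_batch([("a longer answer", "short")])
```

Pair-level metrics like this can surface length bias in preference datasets, where chosen responses are systematically longer.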
- class oumi.analyze.TestEngine(tests: list[TestParams])[source]#
Bases: object
Engine for running tests on typed analysis results.
Tests operate on typed Pydantic results, not DataFrames. This ensures tests are pure validation with no computation: all metrics must be pre-computed by analyzers.
Example
>>> from oumi.analyze.testing import TestEngine, TestParams, TestType
>>> from oumi.core.configs.params.test_params import TestSeverity
>>>
>>> tests = [
...     TestParams(
...         id="max_tokens",
...         type=TestType.THRESHOLD,
...         metric="length.total_tokens",
...         operator=">",
...         value=10000,
...         max_percentage=5.0,
...         severity=TestSeverity.MEDIUM,
...     ),
... ]
>>> engine = TestEngine(tests)
>>> summary = engine.run(results)
>>> print(f"Pass rate: {summary.pass_rate}%")
- Parameters:
tests – List of test configurations.
- run(results: dict[str, list[BaseModel] | BaseModel]) TestSummary[source]#
Run all tests on the analysis results.
- Parameters:
results – Dictionary mapping analyzer names to results.
- Returns:
TestSummary containing all test results.
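The threshold-test semantics can be sketched as below. This is an assumed interpretation inferred from the parameter names (`metric`, `operator`, `value`, `max_percentage`) and the TestResult fields, not TestEngine's actual code:

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}

# Assumed semantics: a sample is "affected" when `metric <op> value`
# holds, and the test fails when the affected percentage exceeds
# max_percentage.
def run_threshold_test(values: list, op: str, value: float,
                       max_percentage: float) -> dict:
    affected = [v for v in values if OPS[op](v, value)]
    pct = 100.0 * len(affected) / len(values) if values else 0.0
    return {
        "affected_count": len(affected),
        "affected_percentage": pct,
        "passed": pct <= max_percentage,
    }

# One of three samples exceeds 10000 tokens -> ~33% affected -> fail at 5%.
outcome = run_threshold_test([120, 80, 15000], ">", 10000, max_percentage=5.0)
```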
- class oumi.analyze.TestResult(*, test_id: str, passed: bool, severity: TestSeverity = TestSeverity.MEDIUM, title: str = '', description: str = '', metric: str = '', affected_count: int = 0, total_count: int = 0, affected_percentage: float = 0.0, threshold: float | None = None, actual_value: float | None = None, sample_indices: list[int] = <factory>, all_affected_indices: list[int] = <factory>, error: str | None = None, details: dict[str, Any] = <factory>)[source]#
Bases: BaseModel
Result of a single test execution.
- Variables:
test_id (str) – Unique identifier for the test.
passed (bool) – Whether the test passed.
severity (oumi.core.configs.params.test_params.TestSeverity) – Severity level of the test.
title (str) – Human-readable title.
description (str) – Description of what the test checks.
metric (str) – The metric being tested (e.g., “analyzer_name.field”).
affected_count (int) – Number of samples that failed the test.
total_count (int) – Total number of samples tested.
affected_percentage (float) – Percentage of samples affected.
threshold (float | None) – The configured threshold for the test.
actual_value (float | None) – The actual computed value (for threshold tests).
sample_indices (list[int]) – Indices of affected samples (limited).
all_affected_indices (list[int]) – Indices of all affected samples, without the limit applied to sample_indices.
error (str | None) – Error message if test execution failed.
details (dict[str, Any]) – Additional details about the test result.
- actual_value: float | None#
- affected_count: int#
- affected_percentage: float#
- all_affected_indices: list[int]#
- description: str#
- details: dict[str, Any]#
- error: str | None#
- metric: str#
- model_config = {}#
Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.
- passed: bool#
- sample_indices: list[int]#
- severity: TestSeverity#
- test_id: str#
- threshold: float | None#
- title: str#
- total_count: int#
- class oumi.analyze.TestSummary(*, results: list[TestResult] = <factory>, total_tests: int = 0, passed_tests: int = 0, failed_tests: int = 0, error_tests: int = 0, pass_rate: float = 0.0, high_severity_failures: int = 0, medium_severity_failures: int = 0, low_severity_failures: int = 0)[source]#
Bases: BaseModel
Summary of all test results.
- Variables:
results (list[oumi.analyze.testing.results.TestResult]) – List of individual test results.
total_tests (int) – Total number of tests run.
passed_tests (int) – Number of tests that passed.
failed_tests (int) – Number of tests that failed.
error_tests (int) – Number of tests that had errors.
pass_rate (float) – Percentage of tests that passed.
high_severity_failures (int) – Number of high severity failures.
medium_severity_failures (int) – Number of medium severity failures.
low_severity_failures (int) – Number of low severity failures.
- error_tests: int#
- failed_tests: int#
- classmethod from_results(results: list[TestResult]) TestSummary[source]#
Create a summary from a list of test results.
- Parameters:
results – List of test results.
- Returns:
TestSummary with computed statistics.
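The statistics from_results computes can be sketched with plain dicts standing in for TestResult instances; the exact precedence of errors over pass/fail below is an assumption:

```python
# Sketch of the summary statistics, assuming a result with a non-empty
# error is counted as an error rather than a pass or fail.
def summarize(results: list) -> dict:
    total = len(results)
    errors = sum(1 for r in results if r.get("error"))
    passed = sum(1 for r in results if r["passed"] and not r.get("error"))
    failed = total - passed - errors
    pass_rate = 100.0 * passed / total if total else 0.0
    return {
        "total_tests": total,
        "passed_tests": passed,
        "failed_tests": failed,
        "error_tests": errors,
        "pass_rate": pass_rate,
    }

summary = summarize([
    {"passed": True},
    {"passed": False},
    {"passed": False, "error": "boom"},
])
```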
- get_error_results() list[TestResult][source]#
Get all test results with errors.
- get_failed_results() list[TestResult][source]#
Get all failed test results.
- get_passed_results() list[TestResult][source]#
Get all passed test results.
- high_severity_failures: int#
- low_severity_failures: int#
- medium_severity_failures: int#
- model_config = {}#
Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.
- pass_rate: float#
- passed_tests: int#
- results: list[TestResult]#
- total_tests: int#
- class oumi.analyze.TurnStatsAnalyzer[source]#
Bases: ConversationAnalyzer[TurnStatsMetrics]
Analyzer for computing turn statistics of conversations.
Computes turn counts and per-role statistics to help understand conversation structure and balance.
Example
>>> from oumi.analyze.analyzers.turn_stats import TurnStatsAnalyzer
>>> from oumi.core.types.conversation import Conversation, Message, Role
>>>
>>> analyzer = TurnStatsAnalyzer()
>>> conversation = Conversation(messages=[
...     Message(role=Role.USER, content="What is Python?"),
...     Message(
...         role=Role.ASSISTANT,
...         content="Python is a programming language.",
...     ),
... ])
>>> result = analyzer.analyze(conversation)
>>> print(f"Turns: {result.num_turns}")
Turns: 2
- analyze(conversation: Conversation) TurnStatsMetrics[source]#
Analyze turn statistics for a conversation.
- Parameters:
conversation – The conversation to analyze.
- Returns:
TurnStatsMetrics containing turn counts and statistics.
- class oumi.analyze.TurnStatsMetrics(*, num_turns: int, num_user_turns: int, num_assistant_turns: int, num_tool_turns: int = 0, has_system_message: bool, first_turn_role: str | None = None, last_turn_role: str | None = None)[source]#
Bases: BaseModel
Result model for turn statistics analysis of conversations.
Example
>>> result = TurnStatsMetrics(
...     num_turns=4,
...     num_user_turns=2,
...     num_assistant_turns=2,
...     has_system_message=False,
...     first_turn_role="user",
...     last_turn_role="assistant",
... )
>>> print(result.num_turns)
4
- first_turn_role: str | None#
- has_system_message: bool#
- last_turn_role: str | None#
- model_config = {}#
Configuration for the model; should be a dictionary conforming to pydantic.ConfigDict.
- num_assistant_turns: int#
- num_tool_turns: int#
- num_turns: int#
- num_user_turns: int#
- class oumi.analyze.TypedAnalyzeConfig(eval_name: str | None = None, parent_eval_id: str | None = None, dataset_name: str | None = None, dataset_path: str | None = None, split: str = 'train', subset: str | None = None, sample_count: int | None = None, output_path: str = '.', analyzers: list[AnalyzerConfig] = <factory>, custom_metrics: list[CustomMetricConfig] = <factory>, tests: list[TestParams] = <factory>, tokenizer_name: str | None = None, tokenizer_kwargs: dict[str, Any] = <factory>, generate_report: bool = False, report_title: str | None = None)[source]#
Bases: object
Configuration for the typed analyzer pipeline.
This is the main configuration class for the new typed analyzer architecture. It supports both programmatic construction and loading from YAML files.
Example YAML:
dataset_path: /path/to/data.jsonl
sample_count: 1000
output_path: ./analysis_output
analyzers:
  - id: length
    params:
      count_tokens: true
  - id: quality
custom_metrics:
  - id: turn_pattern
    scope: conversation
    function: |
      def compute(conversation):
          ...
tests:
  - id: max_words
    type: threshold
    metric: LengthAnalyzer.total_words
    operator: ">"
    value: 10000
    max_percentage: 5.0
- Variables:
dataset_name (str | None) – Name of the dataset (HuggingFace identifier).
dataset_path (str | None) – Path to local dataset file.
split (str) – Dataset split to use.
sample_count (int | None) – Number of samples to analyze.
output_path (str) – Directory for output artifacts.
analyzers (list[oumi.analyze.config.AnalyzerConfig]) – List of analyzer configurations.
custom_metrics (list[oumi.analyze.config.CustomMetricConfig]) – List of custom metric configurations.
tests (list[oumi.core.configs.params.test_params.TestParams]) – List of test configurations.
tokenizer_name (str | None) – Tokenizer for token counting.
generate_report (bool) – Whether to generate HTML report.
report_title (str | None) – Custom title for the report.
- analyzers: list[AnalyzerConfig]#
- custom_metrics: list[CustomMetricConfig]#
- dataset_name: str | None = None#
- dataset_path: str | None = None#
- eval_name: str | None = None#
- classmethod from_dict(data: dict[str, Any], allow_custom_code: bool = False) TypedAnalyzeConfig[source]#
Create configuration from a dictionary.
- Parameters:
data – Configuration dictionary.
allow_custom_code – If True, allow custom_metrics with function code. If False (default) and the config contains custom metrics with code, raises ValueError.
- Returns:
TypedAnalyzeConfig instance.
- Raises:
ValueError – If config contains custom code but allow_custom_code=False, or if duplicate analyzer instance_ids found.
- classmethod from_yaml(path: str | Path, allow_custom_code: bool = False) TypedAnalyzeConfig[source]#
Load configuration from a YAML file.
Warning
Security Warning: If the YAML file contains custom_metrics with function fields, arbitrary Python code will be loaded. Only load configurations from trusted sources. Set allow_custom_code=True to explicitly acknowledge this risk.
- Parameters:
path – Path to YAML configuration file.
allow_custom_code – If True, allow loading custom_metrics with function code. If False (default) and the config contains custom metrics with code, raises ValueError.
- Returns:
TypedAnalyzeConfig instance.
- Raises:
ValueError – If config contains custom code but allow_custom_code=False.
- generate_report: bool = False#
- output_path: str = '.'#
- parent_eval_id: str | None = None#
- report_title: str | None = None#
- sample_count: int | None = None#
- split: str = 'train'#
- subset: str | None = None#
- tests: list[TestParams]#
- tokenizer_kwargs: dict[str, Any]#
- tokenizer_name: str | None = None#
- oumi.analyze.create_analyzer_from_config(analyzer_id: str, params: dict) MessageAnalyzer | ConversationAnalyzer | DatasetAnalyzer | None[source]#
Create an analyzer instance from configuration.
Prefers using the analyzer’s from_config() classmethod if available, otherwise falls back to direct instantiation with **params.
- Parameters:
analyzer_id – Analyzer type identifier.
params – Analyzer-specific parameters.
- Returns:
Analyzer instance or None if not found.
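The "prefer from_config, else direct construction" fallback described above can be sketched as follows; the helper name and toy classes are invented for illustration:

```python
# Sketch of the construction fallback: use a from_config classmethod
# when the analyzer class provides one, otherwise pass params as
# keyword arguments to the constructor.
def build(cls, params: dict):
    if hasattr(cls, "from_config"):
        return cls.from_config(params)
    return cls(**params)

class WithFactory:
    def __init__(self, name: str):
        self.name = name

    @classmethod
    def from_config(cls, params: dict):
        # A factory can apply defaults or build dependencies first.
        return cls(name=params.get("name", "default"))

class PlainInit:
    def __init__(self, name: str):
        self.name = name

a = build(WithFactory, {})                 # goes through from_config
b = build(PlainInit, {"name": "direct"})   # falls back to __init__(**params)
```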
- oumi.analyze.describe_analyzer(analyzer_class: type) str[source]#
Get a human-readable description of an analyzer’s metrics.
- oumi.analyze.get_analyzer_class(name: str) type | None[source]#
Get an analyzer class by name.
- Parameters:
name – Name of the analyzer.
- Returns:
The analyzer class or None if not found.
- oumi.analyze.get_analyzer_info(analyzer_class: type) dict[str, Any][source]#
Get detailed information about an analyzer’s output metrics.
- oumi.analyze.get_instance_metrics(analyzer_class: type, config: dict[str, Any] | None = None) list[str][source]#
Get available metrics, attempting to instantiate with config for filtering.
- oumi.analyze.list_available_metrics(include_duplicates: bool = False) dict[str, dict[str, Any]][source]#
List all available metrics from registered analyzers.
- oumi.analyze.print_analyzer_metrics(analyzer_name: str | None = None) None[source]#
Pretty print available metrics for analyzers.
- Parameters:
analyzer_name – Optional specific analyzer to show. If None, shows all.
- oumi.analyze.register_analyzer(registry_name: str) Callable#
Returns a decorator that registers a sample analyzer in the Oumi global registry.
- Parameters:
registry_name – The name that the sample analyzer should be registered with.
- Returns:
Decorator function to register the target sample analyzer.
- oumi.analyze.to_analysis_dataframe(conversations: list[Conversation], results: Mapping[str, Sequence[BaseModel] | BaseModel], message_to_conversation_idx: list[int] | None = None) DataFrame[source]#
Convert typed analysis results to a pandas DataFrame.
Creates a DataFrame with one row per conversation, with columns for conversation metadata and all analyzer metrics. Analyzer field names are prefixed with the analyzer name to avoid collisions.
Example
>>> results = {"LengthAnalyzer": [LengthMetrics(...), LengthMetrics(...)]}
>>> df = to_analysis_dataframe(conversations, results)
>>> print(df.columns.tolist())
['conversation_id', 'conversation_index', 'num_messages', 'length__total_chars', 'length__total_words', ...]
- Parameters:
conversations – List of conversations that were analyzed.
results – Dictionary mapping analyzer names to results.
- For per-conversation results: list of BaseModel (len = num conversations)
- For message-level results: list of BaseModel (len = num messages)
- For dataset-level results: single BaseModel (will be repeated)
message_to_conversation_idx – Optional mapping from message index to conversation index. Required for proper aggregation of message-level results. If provided, message-level results will be aggregated per conversation.
- Returns:
DataFrame with conversation metadata and all metrics as columns.
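The column-prefixing scheme can be sketched without pandas, using dicts in place of result models and rows in place of a DataFrame (the helper name is invented; only the `<analyzer>__<field>` naming comes from the example above):

```python
# Sketch of the flattening step: each analyzer's fields are namespaced
# as "<analyzer>__<field>" so two analyzers cannot collide on a column.
def to_rows(results: dict) -> list:
    num_conversations = len(next(iter(results.values())))
    rows = []
    for i in range(num_conversations):
        row = {"conversation_index": i}
        for analyzer_name, per_conv in results.items():
            for field_name, value in per_conv[i].items():
                row[f"{analyzer_name}__{field_name}"] = value
        rows.append(row)
    return rows

rows = to_rows({"length": [{"total_tokens": 12}, {"total_tokens": 7}]})
```

A list of such dicts can be handed directly to `pandas.DataFrame(rows)` to recover the tabular form.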