TextNerJob Class

Configuration for AutoML Text NER Job.

Inheritance
azure.ai.ml.entities._job.automl.nlp.automl_nlp_job.AutoMLNLPJob
TextNerJob

Constructor

TextNerJob(*, training_data: Input | None = None, validation_data: Input | None = None, primary_metric: str | None = None, log_verbosity: str | None = None, **kwargs: Any)

Keyword-Only Parameters

Name Description
training_data

Training data to be used for training, defaults to None

validation_data

Validation data to be used for evaluating the trained model, defaults to None

primary_metric

The primary metric to be displayed, defaults to None

log_verbosity

Log verbosity level, defaults to None

Examples

Creating an AutoML text NER job


   from azure.ai.ml import automl, Input
   from azure.ai.ml.constants import AssetTypes

   text_ner_job = automl.TextNerJob(
       experiment_name="my_experiment",
       compute="my_compute",
       training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
       validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
       tags={"my_custom_tag": "My custom value"},
   )

Methods

dump

Dumps the job content into a file in YAML format.

extend_search_space

Add one or more search spaces for an AutoML NLP job.

set_data

Define the data configuration for the NLP job.

set_featurization

Define the featurization configuration for the AutoML NLP job.

set_limits

Define the limit configuration for the AutoML NLP job.

set_sweep

Define the sweep configuration for the AutoML NLP job.

set_training_parameters

Fix certain training parameters throughout the training procedure for all candidates.

dump

Dumps the job content into a file in YAML format.

dump(dest: str | PathLike | IO, **kwargs: Any) -> None

Parameters

Name Description
dest
Required
Union[PathLike, str, IO[AnyStr]]

The local path or file stream to write the YAML content to. If dest is a file path, a new file will be created. If dest is an open file, the file will be written to directly.

Keyword-Only Parameters

Name Description
kwargs

Additional arguments to pass to the YAML serializer.

Exceptions

Type Description

Raised if dest is a file path and the file already exists.

Raised if dest is an open file and the file is not writable.
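Examples

A minimal sketch of dumping a job configuration to YAML. The destination path is illustrative and, per the exceptions above, must not point to an existing file.


   from azure.ai.ml import automl, Input
   from azure.ai.ml.constants import AssetTypes

   text_ner_job = automl.TextNerJob(
       training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
       validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
   )

   # Serialize the job configuration to a new YAML file (the path is a placeholder).
   text_ner_job.dump("./text_ner_job.yml")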

extend_search_space

Add one or more search spaces for an AutoML NLP job.

extend_search_space(value: SearchSpace | List[SearchSpace]) -> None

Parameters

Name Description
value
Required

Either a SearchSpace object or a list of SearchSpace objects with NLP-specific parameters.

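Examples

A hedged sketch of adding NLP search spaces. It assumes SearchSpace can be imported from azure.ai.ml.automl and the sweep distributions from azure.ai.ml.sweep; adjust the imports to your SDK version if needed. The hyperparameter values are illustrative.


   from azure.ai.ml import automl
   from azure.ai.ml.automl import SearchSpace  # import location assumed; adjust if needed
   from azure.ai.ml.sweep import Choice, Uniform

   text_ner_job = automl.TextNerJob()

   # Each SearchSpace bundles NLP-specific hyperparameters to sweep over.
   text_ner_job.extend_search_space(
       [
           SearchSpace(model_name=Choice(["bert-base-cased", "roberta-base"])),
           SearchSpace(
               model_name=Choice(["distilroberta-base"]),
               weight_decay=Uniform(0.01, 0.1),
           ),
       ]
   )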

set_data

Define the data configuration for the NLP job.

set_data(*, training_data: Input, target_column_name: str, validation_data: Input) -> None

Keyword-Only Parameters

Name Description
training_data

Training data

target_column_name

Column name of the target column.

validation_data

Validation data

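Examples

A minimal sketch of wiring up the data inputs. The MLTable folder paths and the target column name "label" are placeholders for your own dataset.


   from azure.ai.ml import automl, Input
   from azure.ai.ml.constants import AssetTypes

   text_ner_job = automl.TextNerJob()

   # Paths and the target column name are placeholders.
   text_ner_job.set_data(
       training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
       target_column_name="label",
       validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
   )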

set_featurization

Define the featurization configuration for the AutoML NLP job.

set_featurization(*, dataset_language: str | None = None) -> None

Keyword-Only Parameters

Name Description
dataset_language

Language of the dataset, defaults to None

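Examples

A minimal sketch of setting the dataset language; "eng" is an illustrative language code, substitute the one that matches your data.


   from azure.ai.ml import automl

   text_ner_job = automl.TextNerJob()

   # "eng" is an illustrative dataset language code.
   text_ner_job.set_featurization(dataset_language="eng")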

set_limits

Define the limit configuration for the AutoML NLP job.

set_limits(*, max_trials: int = 1, max_concurrent_trials: int = 1, max_nodes: int = 1, timeout_minutes: int | None = None, trial_timeout_minutes: int | None = None) -> None

Keyword-Only Parameters

Name Description
max_trials

Maximum number of AutoML iterations, defaults to 1

default value: 1
max_concurrent_trials

Maximum number of concurrent AutoML iterations, defaults to 1

default value: 1
max_nodes

Maximum number of nodes used for sweep, defaults to 1

default value: 1
timeout_minutes

Timeout for the AutoML job, in minutes, defaults to None

trial_timeout_minutes

Timeout for each AutoML trial, in minutes, defaults to None

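Examples

A minimal sketch of configuring limits; all values are illustrative.


   from azure.ai.ml import automl

   text_ner_job = automl.TextNerJob()

   # Illustrative limits: 4 trials, 2 at a time, on up to 2 nodes, with timeouts in minutes.
   text_ner_job.set_limits(
       max_trials=4,
       max_concurrent_trials=2,
       max_nodes=2,
       timeout_minutes=120,
       trial_timeout_minutes=60,
   )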

set_sweep

Define the sweep configuration for the AutoML NLP job.

set_sweep(*, sampling_algorithm: str | SamplingAlgorithmType, early_termination: EarlyTerminationPolicy | None = None) -> None

Keyword-Only Parameters

Name Description
sampling_algorithm

Required. Specifies type of hyperparameter sampling algorithm. Possible values include: "Grid", "Random", and "Bayesian".

early_termination

Optional. Early termination policy to end poorly performing training candidates, defaults to None.

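Examples

A minimal sketch of configuring a sweep with random sampling and a bandit early-termination policy from azure.ai.ml.sweep; the policy values are illustrative.


   from azure.ai.ml import automl
   from azure.ai.ml.sweep import BanditPolicy

   text_ner_job = automl.TextNerJob()

   # Random sampling with an optional bandit early-termination policy (values illustrative).
   text_ner_job.set_sweep(
       sampling_algorithm="Random",
       early_termination=BanditPolicy(slack_factor=0.1, evaluation_interval=2),
   )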

set_training_parameters

Fix certain training parameters throughout the training procedure for all candidates.

set_training_parameters(*, gradient_accumulation_steps: int | None = None, learning_rate: float | None = None, learning_rate_scheduler: str | NlpLearningRateScheduler | None = None, model_name: str | None = None, number_of_epochs: int | None = None, training_batch_size: int | None = None, validation_batch_size: int | None = None, warmup_ratio: float | None = None, weight_decay: float | None = None) -> None

Keyword-Only Parameters

Name Description
gradient_accumulation_steps

Number of steps over which to accumulate gradients before a backward pass. This must be a positive integer. Defaults to None.

learning_rate

Initial learning rate. Must be a float in (0, 1). Defaults to None.

learning_rate_scheduler

The type of learning rate scheduler. Must choose from 'linear', 'cosine', 'cosine_with_restarts', 'polynomial', 'constant', and 'constant_with_warmup'. Defaults to None.

model_name

The model name to use during training. Must choose from 'bert-base-cased', 'bert-base-uncased', 'bert-base-multilingual-cased', 'bert-base-german-cased', 'bert-large-cased', 'bert-large-uncased', 'distilbert-base-cased', 'distilbert-base-uncased', 'roberta-base', 'roberta-large', 'distilroberta-base', 'xlm-roberta-base', 'xlm-roberta-large', 'xlnet-base-cased', and 'xlnet-large-cased'. Defaults to None.

number_of_epochs

The number of epochs to train with. Must be a positive integer. Defaults to None.

training_batch_size

The batch size during training. Must be a positive integer. Defaults to None.

validation_batch_size

The batch size during validation. Must be a positive integer. Defaults to None.

warmup_ratio

Ratio of total training steps used for a linear warmup from 0 to learning_rate. Must be a float in [0, 1]. Defaults to None.

weight_decay

Value of weight decay when the optimizer is sgd, adam, or adamw. This must be a float in the range [0, 1]. Defaults to None.

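Examples

A minimal sketch of fixing training parameters for all candidates; the values are illustrative, and the scheduler is passed as a string, which the signature above also accepts.


   from azure.ai.ml import automl

   text_ner_job = automl.TextNerJob()

   # Illustrative fixed training parameters; each value respects the constraints listed above.
   text_ner_job.set_training_parameters(
       model_name="roberta-base",
       learning_rate=5e-5,
       learning_rate_scheduler="linear",
       number_of_epochs=3,
       training_batch_size=32,
       warmup_ratio=0.1,
       weight_decay=0.01,
   )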

Attributes

base_path

The base path of the resource.

Returns

Type Description
str

The base path of the resource.

creation_context

The creation context of the resource.

Returns

Type Description

The creation metadata for the resource.

featurization

Featurization settings used for the NLP job.

Returns

Type Description

Featurization settings for the NLP job.

id

The resource ID.

Returns

Type Description

The global ID of the resource, an Azure Resource Manager (ARM) ID.

inputs

limits

Limit settings for the NLP job.

Returns

Type Description

Limit configuration for the NLP job.

log_files

Job output files.

Returns

Type Description

The dictionary of log names and URLs.

log_verbosity

Log verbosity configuration.

Returns

Type Description

The degree of verbosity used in logging.

outputs

primary_metric

search_space

Search spaces to sweep over for NLP sweep jobs.

Returns

Type Description

List of search spaces to sweep over for NLP jobs.

status

The status of the job.

Common values returned include "Running", "Completed", and "Failed". All possible values are:

  • NotStarted - This is a temporary state that client-side Run objects are in before cloud submission.

  • Starting - The Run has started being processed in the cloud. The caller has a run ID at this point.

  • Provisioning - On-demand compute is being created for a given job submission.

  • Preparing - The run environment is being prepared and is in one of two stages:

    • Docker image build

    • conda environment setup

  • Queued - The job is queued on the compute target. For example, in BatchAI, the job is in a queued state while waiting for all the requested nodes to be ready.

  • Running - The job has started to run on the compute target.

  • Finalizing - User code execution has completed, and the run is in post-processing stages.

  • CancelRequested - Cancellation has been requested for the job.

  • Completed - The run has completed successfully. This includes both the user code execution and run post-processing stages.

  • Failed - The run failed. Usually the Error property on a run will provide details as to why.

  • Canceled - Follows a cancellation request and indicates that the run is now successfully cancelled.

  • NotResponding - For runs that have Heartbeats enabled, no heartbeat has been recently sent.

Returns

Type Description

Status of the job.

studio_url

Azure ML studio endpoint.

Returns

Type Description

The URL to the job details page.

sweep

Sweep settings used for the NLP job.

Returns

Type Description

Sweep settings for the NLP job.

task_type

Get task type.

Returns

Type Description
str

The type of task to run. Possible values include: "classification", "regression", "forecasting".

test_data

Get test data.

Returns

Type Description

Test data input

training_data

Get training data.

Returns

Type Description

Training data input

training_parameters

Parameters that are used for all submitted jobs.

Returns

Type Description

Fixed training parameters for NLP jobs.

type

The type of the job.

Returns

Type Description

The type of the job.

validation_data

Get validation data.

Returns

Type Description

Validation data input