From 1aac1b8661f7bb0497321a6fa161eabd96fc416d Mon Sep 17 00:00:00 2001
From: Brian Rosenberg
Date: Thu, 30 Jan 2025 13:24:13 -0500
Subject: [PATCH 1/7] Add media selectors guide.
---
docs/docs/Media-Selectors-Guide.md | 195 +++++++
docs/mkdocs.yml | 1 +
docs/site/404.html | 4 +
docs/site/Acknowledgements/index.html | 4 +
docs/site/Admin-Guide/index.html | 4 +
docs/site/CPP-Batch-Component-API/index.html | 4 +
.../CPP-Streaming-Component-API/index.html | 4 +
docs/site/Component-API-Overview/index.html | 4 +
.../Component-Descriptor-Reference/index.html | 4 +
docs/site/Contributor-Guide/index.html | 4 +
docs/site/Derivative-Media-Guide/index.html | 4 +
.../Development-Environment-Guide/index.html | 4 +
docs/site/Feed-Forward-Guide/index.html | 4 +
docs/site/GPU-Support-Guide/index.html | 4 +
docs/site/Health-Check-Guide/index.html | 4 +
docs/site/Install-Guide/index.html | 4 +
docs/site/Java-Batch-Component-API/index.html | 4 +
docs/site/License-And-Distribution/index.html | 4 +
docs/site/Markup-Guide/index.html | 4 +
docs/site/Media-Segmentation-Guide/index.html | 4 +
docs/site/Media-Selectors-Guide/index.html | 520 ++++++++++++++++++
docs/site/Node-Guide/index.html | 4 +
docs/site/Object-Storage-Guide/index.html | 4 +
docs/site/OpenID-Connect-Guide/index.html | 4 +
.../Python-Batch-Component-API/index.html | 4 +
docs/site/Quality-Selection-Guide/index.html | 8 +-
docs/site/REST-API/index.html | 8 +-
docs/site/Release-Notes/index.html | 4 +
docs/site/Roll-Up-Guide/index.html | 4 +
docs/site/TiesDb-Guide/index.html | 4 +
docs/site/Trigger-Guide/index.html | 4 +
docs/site/User-Guide/index.html | 4 +
.../Workflow-Manager-Architecture/index.html | 4 +
docs/site/index.html | 6 +-
docs/site/search.html | 4 +
docs/site/search/search_index.json | 44 +-
docs/site/sitemap.xml | 65 ++-
37 files changed, 926 insertions(+), 37 deletions(-)
create mode 100644 docs/docs/Media-Selectors-Guide.md
create mode 100644 docs/site/Media-Selectors-Guide/index.html
diff --git a/docs/docs/Media-Selectors-Guide.md b/docs/docs/Media-Selectors-Guide.md
new file mode 100644
index 000000000000..a523ebfd5ff6
--- /dev/null
+++ b/docs/docs/Media-Selectors-Guide.md
@@ -0,0 +1,195 @@
+**NOTICE:** This software (or technical data) was produced for the U.S. Government under contract,
+and is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025
+The MITRE Corporation. All Rights Reserved.
+
+# Media Selectors Overview
+
+Media selectors allow users to specify that only specific sections of a document should be
+processed. The job produces a copy of the input file in which the specified sections are
+replaced by component output.
+
+
+# New Job Request Fields
+
+Below is an example of a job that uses media selectors. The job uses a two-stage pipeline.
+The first stage performs language identification. The second performs translation.
+```json
+{
+ "algorithmProperties": {},
+ "buildOutput": true,
+ "jobProperties": {},
+ "media": [
+ {
+ "mediaUri": "file:///opt/mpf/share/remote-media/test-json-path-translation.json",
+ "properties": {},
+ "mediaSelectorsOutputAction": "ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION",
+ "mediaSelectors": [
+ {
+ "type": "JSON_PATH",
+ "expression": "$.spanishMessages.*.content",
+ "resultDetectionProperty": "TRANSLATION",
+ "selectionProperties": {}
+ },
+ {
+ "type": "JSON_PATH",
+ "expression": "$.chineseMessages.*.content",
+ "resultDetectionProperty": "TRANSLATION",
+ "selectionProperties": {}
+ }
+ ]
+ }
+ ],
+ "pipelineName": "ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE",
+ "priority": 4
+}
+```
+- `$.media.*.mediaSelectorsOutputAction`: Name of the action that produces content for the media
+ selectors output file. In the above example, we specify that we want the translated content
+ from Argos in the media selectors output file rather than the detected language from the first
+ stage.
+- `$.media.*.mediaSelectors`: List of media selectors that will be used for the media.
+- `$.media.*.mediaSelectors.*.type`: The name of the type of media selector that is used in the
+ `expression` field.
+- `$.media.*.mediaSelectors.*.resultDetectionProperty`: A detection property name from tracks
+ produced by the `mediaSelectorsOutputAction`. The media selectors output document will be
+ populated with the content of the specified property.
+- `$.media.*.mediaSelectors.*.selectionProperties`: Job properties that will only be used for
+ sub-jobs created for a specific media selector.
+
+
+# New Job Properties
+- `MEDIA_SELECTORS_DELIMETER`: When not provided and a job uses media selectors, the selected parts
+ of the document will be replaced with the action output. When provided, the selected parts of
+ the document will contain the original content, followed by the value of this property, and
+ finally the action output.
+- `MEDIA_SELECTORS_DUPLICATE_POLICY`: Specifies how to handle the case where a job uses media
+ selectors and there are multiple outputs for a single selection. When set to `LONGEST`, the
+ longer of the two outputs is chosen and the shorter one is discarded. When set to `ERROR`,
+ duplicates are considered an error. When set to `JOIN`, the duplicates are combined using
+ ` | ` as a delimiter.
+- `MEDIA_SELECTORS_NO_MATCHES_IS_ERROR`: When true and a job uses media selectors, an error will be
+ generated when none of the selectors match content from the media.
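+
+The delimiter and duplicate policy behavior described above can be sketched in Python. This is
+a minimal illustration of the documented behavior, not the Workflow Manager's implementation.
+The function names `resolve_duplicates` and `render_selection` are hypothetical, and the sketch
+assumes `LONGEST` extends to more than two outputs by keeping the longest one.
+
+```python
+def resolve_duplicates(outputs, policy):
+    """Reduce a non-empty list of outputs for one selection to a single string."""
+    if len(outputs) == 1:
+        return outputs[0]
+    if policy == "LONGEST":
+        # Keep the longest output and discard the rest.
+        return max(outputs, key=len)
+    if policy == "JOIN":
+        # Combine the duplicates using " | " as the delimiter.
+        return " | ".join(outputs)
+    if policy == "ERROR":
+        raise ValueError("multiple outputs for a single selection")
+    raise ValueError(f"unknown duplicate policy: {policy}")
+
+
+def render_selection(original, output, delimiter=None):
+    """Build the text written to the selected part of the output document."""
+    if delimiter is None:
+        # No delimiter: the selection is replaced by the action output.
+        return output
+    # Delimiter given: original content, then the delimiter, then the output.
+    return original + delimiter + output
+```
+
+For example, `render_selection("Hola", "Hello", " / ")` yields the original text, the delimiter,
+and the translation in that order, while omitting the delimiter yields only the translation.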
+
+
+# Media Selector Types
+
+`JSON_PATH` is the only type currently supported, but others are planned.
+
+
+## JSON_PATH
+
+Used to extract content from JSON files. Uses the "Jayway JsonPath" library to parse the expressions.
+The specific syntax supported is available on their
+[GitHub page](https://github.com/json-path/JsonPath?tab=readme-ov-file#operators).
+
+When extracting content from the document, only strings, arrays, and objects are considered. All
+other JSON types are ignored. When the JsonPath expression matches an array, each element is
+recursively explored. When the expression matches an object, keys are left unchanged and each value
+of the object is recursively explored.
+
+### JSON_PATH Matching Example
+
+```json
+{
+ "key1": ["a", "b", "c"],
+ "key2": {
+ "key3": [
+ {
+ "key4": ["d", "e"],
+ "key5": ["f", "g"],
+ "key6": 6
+ }
+ ]
+ }
+}
+```
+Expression | Matches
+---------------------|-----------
+`$` | a, b, c, d, e, f, g
+`$.*` | a, b, c, d, e, f, g
+`$.key1` | a, b, c
+`$.key1[0]` | a
+`$.key2` | d, e, f, g
+`$.key2.key3` | d, e, f, g
+`$.key2.key3.*.key4` | d, e
+`$.key2.key3.*.*[0]` | d, f
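+
+The recursive exploration that produces the matches in the table can be sketched in Python.
+This is an illustration only: it starts from a value that a JsonPath expression has already
+matched, it does not evaluate JsonPath itself, and the function name `collect_strings` is
+hypothetical.
+
+```python
+def collect_strings(node):
+    """Return every string reachable from a matched JSON value, in document order."""
+    if isinstance(node, str):
+        return [node]
+    if isinstance(node, list):
+        # Arrays: explore each element recursively.
+        return [s for item in node for s in collect_strings(item)]
+    if isinstance(node, dict):
+        # Objects: keys are left alone; only the values are explored.
+        return [s for value in node.values() for s in collect_strings(value)]
+    # Numbers, booleans, and null are ignored.
+    return []
+
+
+doc = {
+    "key1": ["a", "b", "c"],
+    "key2": {"key3": [{"key4": ["d", "e"], "key5": ["f", "g"], "key6": 6}]},
+}
+
+print(collect_strings(doc))          # value matched by `$`
+print(collect_strings(doc["key2"]))  # value matched by `$.key2`
+```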
+
+
+
+# Media Selectors Output File
+
+When media selectors are used, the JsonOutputObject will contain a URI referencing the file
+location in the `$.media.*.mediaSelectorsOutputUri` field.
+
+The job from the [New Job Request Fields section](#new-job-request-fields) could be used with the
+document below.
+```json
+{
+ "otherStuffKey": ["other stuff value"],
+ "spanishMessages": [
+ {
+ "to": "spanish recipient 1",
+ "from": "spanish sender 1",
+ "content": "¿Hola, cómo estás?"
+ },
+ {
+ "to": "spanish recipient 2",
+ "from": "spanish sender 2",
+ "content": "¿Dónde está la biblioteca?"
+ }
+ ],
+ "chineseMessages": [
+ {
+ "to": "chinese recipient 1",
+ "from": "chinese sender 1",
+ "content": "现在是几奌?"
+ },
+ {
+ "to": "chinese recipient 2",
+ "from": "chinese sender 2",
+ "content": "你叫什么名字?"
+ },
+ {
+ "to": "chinese recipient 3",
+ "from": "chinese sender 3",
+ "content": "你在哪里?"
+ }
+ ]
+}
+```
+
+The `mediaSelectorsOutputUri` field will refer to a document containing the content below.
+```json
+{
+ "otherStuffKey": ["other stuff value"],
+ "spanishMessages": [
+ {
+ "to": "spanish recipient 1",
+ "from": "spanish sender 1",
+ "content": "Hello, how are you?"
+ },
+ {
+ "to": "spanish recipient 2",
+ "from": "spanish sender 2",
+ "content": "Where is the library?"
+ }
+ ],
+ "chineseMessages": [
+ {
+ "to": "chinese recipient 1",
+ "from": "chinese sender 1",
+ "content": "What time is it?"
+ },
+ {
+ "to": "chinese recipient 2",
+ "from": "chinese sender 2",
+ "content": "What is your name?"
+ },
+ {
+ "to": "chinese recipient 3",
+ "from": "chinese sender 3",
+ "content": "Where are you?"
+ }
+ ]
+}
+```
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 9c3a88730ee1..f92f4cb74a85 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -30,6 +30,7 @@ pages:
- Roll Up Guide: Roll-Up-Guide.md
- Health Check Guide: Health-Check-Guide.md
- Quality Selection Guide: Quality-Selection-Guide.md
+ - Media Selectors Guide: Media-Selectors-Guide.md
- REST API: REST-API.md
- Component Development:
- Component API Overview: Component-API-Overview.md
diff --git a/docs/site/404.html b/docs/site/404.html
index 8fd80512eec5..c50b2f96254c 100644
--- a/docs/site/404.html
+++ b/docs/site/404.html
@@ -126,6 +126,10 @@
NOTICE: This software (or technical data) was produced for the U.S. Government under contract,
+and is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025
+The MITRE Corporation. All Rights Reserved.
+
Media Selectors Overview
+
Media selectors allow users to specify that only specific sections of a document should be
+processed. A copy of the input file with the specified sections replaced by component output is
+produced.
+
New Job Request Fields
+
Below is an example of a job that uses media selectors. The job uses a two stage pipeline.
+The first stage performs language identification. The second performs translation.
$.media.*.mediaSelectorsOutputAction: Name of the action that produces content for the media
+ selectors output file. In the above example, we specify that we want the translated content
+ from Argos in the media selectors output file rather than the detected language from the first
+ stage.
+
$.media.*.mediaSelectors: List of media selectors that will be used for the media.
+
$.media.*.mediaSelectors.*.type: The name of the type of media selector that is used in the
+ expression field.
+
$.media.*.mediaSelectors.*.resultDetectionProperty: A detection property name from tracks
+ produced by the mediaSelectorsOutputAction. The media selectors output document will be
+ populated with the content of the specified property.
+
$.media.*.mediaSelectors.*.selectionProperties: Job properties that will only be used for
+ sub-jobs created for a specific media selector.
+
+
New Job Properties
+
+
MEDIA_SELECTORS_DELIMETER: When not provided and a job uses media selectors, the selected parts
+ of the document will be replaced with the action output. When provided, the selected parts of
+ the document will contain the original content, followed by the value of this property, and
+ finally the action output.
+
MEDIA_SELECTORS_DUPLICATE_POLICY: Specifies how to handle the case where a job uses media
+ selectors and there are multiple outputs for a single selection. When set to LONGEST, the
+ longer of the two outputs is chosen and the shorter one is discarded. When set to ERROR,
+ duplicates are considered an error. When set to JOIN, the duplicates are combined using
+ | as a delimiter.
+
MEDIA_SELECTORS_NO_MATCHES_IS_ERROR: When true and a job uses media selectors, an error will be
+ generated when none of the selectors match content from the media.
+
+
Media Selector Types
+
JSON_PATH is the only type currently supported, but others are planned.
+
JSON_PATH
+
Used to extract content from JSON files. Uses the "Jayway JsonPath" library to parse the expressions.
+The specific syntax supported is available on their
+GitHub page.
+
When extracting content from the document, only strings, arrays, and objects are considered. All
+other JSON types are ignored. When the JsonPath expression matches an array, each element is
+recursively explored. When the expression matches an object, keys are left unchanged and each value
+of the object is recursively explored.
diff --git a/docs/site/search/search_index.json b/docs/site/search/search_index.json
index a0449ad2e167..71b32d514819 100644
--- a/docs/site/search/search_index.json
+++ b/docs/site/search/search_index.json
@@ -2,12 +2,12 @@
"docs": [
{
"location": "/index.html",
- "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract, and is subject to the\nRights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2024 The MITRE Corporation. All Rights Reserved.\n\n\nOverview\n\n\nThere are numerous video and image exploitation capabilities available today. The Open Media Processing Framework (OpenMPF) provides a framework for chaining, combining, or replacing individual components for the purpose of experimentation and comparison.\n\n\nOpenMPF is a non-proprietary, scalable framework that permits practitioners and researchers to construct video, imagery, and audio exploitation capabilities using the available third-party components. Using OpenMPF, one can extract targeted entities in large-scale data environments, such as face and object detection.\n\n\nFor those developing new exploitation capabilities, OpenMPF exposes a set of Application Program Interfaces (APIs) for extending media analytics functionality. The APIs allow integrators to introduce new algorithms capable of detecting new targeted entity types. For example, a backpack detection algorithm could be integrated into an OpenMPF instance. 
OpenMPF does not restrict the number of algorithms that can operate on a given media file, permitting researchers, practitioners, and developers to explore arbitrarily complex composites of exploitation algorithms.\n\n\nA list of algorithms currently integrated into the OpenMPF as distributed processing components is shown here:\n\n\n\n\n\n\n\n\nOperation\n\n\nObject Type\n\n\nFramework\n\n\n\n\n\n\n\n\n\n\nDetection/Tracking\n\n\nFace\n\n\nLBP-Based OpenCV\n\n\n\n\n\n\nDetection/Tracking\n\n\nMotion\n\n\nMOG w/ STRUCK\n\n\n\n\n\n\nDetection/Tracking\n\n\nMotion\n\n\nSuBSENSE w/ STRUCK\n\n\n\n\n\n\nDetection/Tracking\n\n\nLicense Plate\n\n\nOpenALPR\n\n\n\n\n\n\nDetection\n\n\nSpeech\n\n\nSphinx\n\n\n\n\n\n\nDetection\n\n\nSpeech\n\n\nAzure Cognitive Services Batch Transcription API\n\n\n\n\n\n\nDetection\n\n\nScene\n\n\nOpenCV\n\n\n\n\n\n\nDetection\n\n\nClassification\n\n\nOpenCV DNN (GoogLeNet, Yahoo NSFW, vehicle color)\n\n\n\n\n\n\nDetection/Tracking\n\n\nClassification\n\n\nOpenCV DNN (YOLO)\n\n\n\n\n\n\nDetection/Tracking\n\n\nClassification/Features\n\n\nTensorRT (COCO classes)\n\n\n\n\n\n\nDetection\n\n\nText Region\n\n\nEAST\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nApache Tika\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nTesseract OCR\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nAzure Cognitive Services Computer Vision API (OCR endpoint)\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nAzure Cognitive Services Read API\n\n\n\n\n\n\nDetection\n\n\nForm Structure (with OCR)\n\n\nAzure Cognitive Services Form Recognizer API\n\n\n\n\n\n\nDetection\n\n\nKeywords\n\n\nBoost Regular Expressions\n\n\n\n\n\n\nDetection\n\n\nImage (from document)\n\n\nApache Tika\n\n\n\n\n\n\nTranslation\n\n\nLanguage\n\n\nAzure Cognitive Services Translate API\n\n\n\n\n\n\n\n\nThe OpenMPF exposes data processing and job management web services via a User Interface (UI). 
These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.",
+ "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract, and is subject to the\nRights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2024 The MITRE Corporation. All Rights Reserved.\n\n\nOverview\n\n\nThere are numerous video and image exploitation capabilities available today. The Open Media Processing Framework (OpenMPF) provides a framework for chaining, combining, or replacing individual components for the purpose of experimentation and comparison.\n\n\nOpenMPF is a non-proprietary, scalable framework that permits practitioners and researchers to construct video, imagery, and audio exploitation capabilities using the available third-party components. Using OpenMPF, one can extract targeted entities in large-scale data environments, such as face and object detection.\n\n\nFor those developing new exploitation capabilities, OpenMPF exposes a set of Application Program Interfaces (APIs) for extending media analytics functionality. The APIs allow integrators to introduce new algorithms capable of detecting new targeted entity types. For example, a backpack detection algorithm could be integrated into an OpenMPF instance. 
OpenMPF does not restrict the number of algorithms that can operate on a given media file, permitting researchers, practitioners, and developers to explore arbitrarily complex composites of exploitation algorithms.\n\n\nA list of algorithms currently integrated into the OpenMPF as distributed processing components is shown here:\n\n\n\n\n\n\n\n\nOperation\n\n\nObject Type\n\n\nFramework\n\n\n\n\n\n\n\n\n\n\nDetection/Tracking\n\n\nFace\n\n\nLBP-Based OpenCV\n\n\n\n\n\n\nDetection/Tracking\n\n\nMotion\n\n\nMOG w/ STRUCK\n\n\n\n\n\n\nDetection/Tracking\n\n\nMotion\n\n\nSuBSENSE w/ STRUCK\n\n\n\n\n\n\nDetection/Tracking\n\n\nLicense Plate\n\n\nOpenALPR\n\n\n\n\n\n\nDetection\n\n\nSpeech\n\n\nSphinx\n\n\n\n\n\n\nDetection\n\n\nSpeech\n\n\nAzure Cognitive Services Batch Transcription API\n\n\n\n\n\n\nDetection\n\n\nScene\n\n\nOpenCV\n\n\n\n\n\n\nDetection\n\n\nClassification\n\n\nOpenCV DNN (GoogLeNet, Yahoo NSFW, vehicle color)\n\n\n\n\n\n\nDetection/Tracking\n\n\nClassification\n\n\nOpenCV DNN (YOLO)\n\n\n\n\n\n\nDetection/Tracking\n\n\nClassification/Features\n\n\nTensorRT (COCO classes)\n\n\n\n\n\n\nDetection\n\n\nText Region\n\n\nEAST\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nApache Tika\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nTesseract OCR\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nAzure Cognitive Services Read API\n\n\n\n\n\n\nDetection\n\n\nForm Structure (with OCR)\n\n\nAzure Cognitive Services Form Recognizer API\n\n\n\n\n\n\nDetection\n\n\nKeywords\n\n\nBoost Regular Expressions\n\n\n\n\n\n\nDetection\n\n\nImage (from document)\n\n\nApache Tika\n\n\n\n\n\n\nTranslation\n\n\nLanguage\n\n\nAzure Cognitive Services Translate API\n\n\n\n\n\n\n\n\nThe OpenMPF exposes data processing and job management web services via a User Interface (UI). These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. 
The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.",
"title": "Home"
},
{
"location": "/index.html#overview",
- "text": "There are numerous video and image exploitation capabilities available today. The Open Media Processing Framework (OpenMPF) provides a framework for chaining, combining, or replacing individual components for the purpose of experimentation and comparison. OpenMPF is a non-proprietary, scalable framework that permits practitioners and researchers to construct video, imagery, and audio exploitation capabilities using the available third-party components. Using OpenMPF, one can extract targeted entities in large-scale data environments, such as face and object detection. For those developing new exploitation capabilities, OpenMPF exposes a set of Application Program Interfaces (APIs) for extending media analytics functionality. The APIs allow integrators to introduce new algorithms capable of detecting new targeted entity types. For example, a backpack detection algorithm could be integrated into an OpenMPF instance. OpenMPF does not restrict the number of algorithms that can operate on a given media file, permitting researchers, practitioners, and developers to explore arbitrarily complex composites of exploitation algorithms. 
A list of algorithms currently integrated into the OpenMPF as distributed processing components is shown here: Operation Object Type Framework Detection/Tracking Face LBP-Based OpenCV Detection/Tracking Motion MOG w/ STRUCK Detection/Tracking Motion SuBSENSE w/ STRUCK Detection/Tracking License Plate OpenALPR Detection Speech Sphinx Detection Speech Azure Cognitive Services Batch Transcription API Detection Scene OpenCV Detection Classification OpenCV DNN (GoogLeNet, Yahoo NSFW, vehicle color) Detection/Tracking Classification OpenCV DNN (YOLO) Detection/Tracking Classification/Features TensorRT (COCO classes) Detection Text Region EAST Detection Text (OCR) Apache Tika Detection Text (OCR) Tesseract OCR Detection Text (OCR) Azure Cognitive Services Computer Vision API (OCR endpoint) Detection Text (OCR) Azure Cognitive Services Read API Detection Form Structure (with OCR) Azure Cognitive Services Form Recognizer API Detection Keywords Boost Regular Expressions Detection Image (from document) Apache Tika Translation Language Azure Cognitive Services Translate API The OpenMPF exposes data processing and job management web services via a User Interface (UI). These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.",
+ "text": "There are numerous video and image exploitation capabilities available today. The Open Media Processing Framework (OpenMPF) provides a framework for chaining, combining, or replacing individual components for the purpose of experimentation and comparison. OpenMPF is a non-proprietary, scalable framework that permits practitioners and researchers to construct video, imagery, and audio exploitation capabilities using the available third-party components. Using OpenMPF, one can extract targeted entities in large-scale data environments, such as face and object detection. For those developing new exploitation capabilities, OpenMPF exposes a set of Application Program Interfaces (APIs) for extending media analytics functionality. The APIs allow integrators to introduce new algorithms capable of detecting new targeted entity types. For example, a backpack detection algorithm could be integrated into an OpenMPF instance. OpenMPF does not restrict the number of algorithms that can operate on a given media file, permitting researchers, practitioners, and developers to explore arbitrarily complex composites of exploitation algorithms. 
A list of algorithms currently integrated into the OpenMPF as distributed processing components is shown here: Operation Object Type Framework Detection/Tracking Face LBP-Based OpenCV Detection/Tracking Motion MOG w/ STRUCK Detection/Tracking Motion SuBSENSE w/ STRUCK Detection/Tracking License Plate OpenALPR Detection Speech Sphinx Detection Speech Azure Cognitive Services Batch Transcription API Detection Scene OpenCV Detection Classification OpenCV DNN (GoogLeNet, Yahoo NSFW, vehicle color) Detection/Tracking Classification OpenCV DNN (YOLO) Detection/Tracking Classification/Features TensorRT (COCO classes) Detection Text Region EAST Detection Text (OCR) Apache Tika Detection Text (OCR) Tesseract OCR Detection Text (OCR) Azure Cognitive Services Read API Detection Form Structure (with OCR) Azure Cognitive Services Form Recognizer API Detection Keywords Boost Regular Expressions Detection Image (from document) Apache Tika Translation Language Azure Cognitive Services Translate API The OpenMPF exposes data processing and job management web services via a User Interface (UI). These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.",
"title": "Overview"
},
{
@@ -630,6 +630,46 @@
"text": "In some cases, there may be a detection property that a component would like to use as a measure of quality but it\ndoesn't lend itself to simple thresholding, perhaps because its value is not linearly increasing, or it is not numeric. The\ncomponent can in this case create a custom property that represents the quality of detections using a numerical value that\ncorresponds to the ordering of the detections from low to high quality. As a simple example, a face detector might be able to calculate the face pose and would like to select for artifact\nextraction the face that is closest to frontal pose, and the two that are closest to left and right profile pose. If the face\ndetector computes the yaw with values between -90 degrees and +90 degrees, then the numerical order of those values would\nnot produce the desired result. In this case, the component could create a custom detection property called RANK , and\nassign values to that property that orders the detections from highest to lowest quality. The face detection component would\nassign the highest value of RANK to the detection with a value of yaw closest to 0, and the detections with values of yaw\nclosest to +/-90 degrees would be assigned the second and third highest values of RANK . Detections without the RANK \nproperty would be treated as having the lowest possible quality value. Thus, the track exemplar would be the face with the\nfrontal pose, and the ARTIFACT_EXTRACTION_POLICY_TOP_QUALITY_COUNT property would be set to 3, so that the frontal and\ntwo profile pose detections would be kept as track artifacts in addition to the exemplar.",
"title": "Hybrid Quality Selection"
},
+ {
+ "location": "/Media-Selectors-Guide/index.html",
+ "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the type of media selector that is used in the\n \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. 
Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\" 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese 
recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}",
+ "title": "Media Selectors Guide"
+ },
+ {
+ "location": "/Media-Selectors-Guide/index.html#media-selectors-overview",
+ "text": "Media selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.",
+ "title": "Media Selectors Overview"
+ },
+ {
+ "location": "/Media-Selectors-Guide/index.html#new-job-request-fields",
+ "text": "Below is an example of a job that uses media selectors. The job uses a two stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector that is used in the\n expression field. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. $.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector.",
+ "title": "New Job Request Fields"
+ },
+ {
+ "location": "/Media-Selectors-Guide/index.html#new-job-properties",
+ "text": "MEDIA_SELECTORS_DELIMETER : When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output. MEDIA_SELECTORS_DUPLICATE_POLICY : Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to LONGEST , the\n longer of the two outputs is chosen and the shorter one is discarded. When set to ERROR ,\n duplicates are considered an error. When set to JOIN , the duplicates are combined using\n | as a delimiter. MEDIA_SELECTORS_NO_MATCHES_IS_ERROR : When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.",
+ "title": "New Job Properties"
+ },
+ {
+ "location": "/Media-Selectors-Guide/index.html#media-selector-types",
+ "text": "JSON_PATH is the only type currently supported, but others are planned.",
+ "title": "Media Selector Types"
+ },
+ {
+ "location": "/Media-Selectors-Guide/index.html#json_path",
+ "text": "Used to extract content from JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their GitHub page . When extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.",
+ "title": "JSON_PATH"
+ },
+ {
+ "location": "/Media-Selectors-Guide/index.html#json_path-matching-example",
+ "text": "{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n} Expression Matches $ a, b, c, d, e, f, g $.* a, b, c, d, e, f, g $.key1 a, b, c $.key1[0] a $.key2 d, e, f, g $.key2.key3 d, e, f, g $.key2.key3.*.key4 d, e $.key2.key3.*.*[0] d, f",
+ "title": "JSON_PATH Matching Example"
+ },
+ {
+ "location": "/Media-Selectors-Guide/index.html#media-selectors-output-file",
+ "text": "When media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the $.media.*.mediaSelectorsOutputUri field. The job from the New Job Request Fields section could be used with the\ndocument below. {\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n} The mediaSelectorsOutputUri field will refer to a document containing the content below. {\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}",
+ "title": "Media Selectors Output File"
+ },
{
"location": "/REST-API/index.html",
"text": "The OpenMPF REST API is provided by Swagger and is available within the OpenMPF Workflow Manager web application. Swagger enables users to test the endpoints using the running instance of OpenMPF.\n\n\nClick \nhere\n for a generated version of the content.\n\n\nNote that in a Docker deployment the \n/rest/nodes\n and \n/rest/streaming\n endpoints are disabled.",
diff --git a/docs/site/sitemap.xml b/docs/site/sitemap.xml
index 81af8c35d1e8..7e6ec8bb1eed 100644
--- a/docs/site/sitemap.xml
+++ b/docs/site/sitemap.xml
@@ -2,152 +2,157 @@
/index.html
- 2024-12-06
+ 2025-01-30daily/Release-Notes/index.html
- 2024-12-06
+ 2025-01-30daily/License-And-Distribution/index.html
- 2024-12-06
+ 2025-01-30daily/Acknowledgements/index.html
- 2024-12-06
+ 2025-01-30daily/Install-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Admin-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/User-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/OpenID-Connect-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Media-Segmentation-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Feed-Forward-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Derivative-Media-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Object-Storage-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Markup-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/TiesDb-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Trigger-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Roll-Up-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Health-Check-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Quality-Selection-Guide/index.html
- 2024-12-06
+ 2025-01-30
+ daily
+
+
+ /Media-Selectors-Guide/index.html
+ 2025-01-30daily/REST-API/index.html
- 2024-12-06
+ 2025-01-30daily/Component-API-Overview/index.html
- 2024-12-06
+ 2025-01-30daily/Component-Descriptor-Reference/index.html
- 2024-12-06
+ 2025-01-30daily/CPP-Batch-Component-API/index.html
- 2024-12-06
+ 2025-01-30daily/Python-Batch-Component-API/index.html
- 2024-12-06
+ 2025-01-30daily/Java-Batch-Component-API/index.html
- 2024-12-06
+ 2025-01-30daily/GPU-Support-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Contributor-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Development-Environment-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Node-Guide/index.html
- 2024-12-06
+ 2025-01-30daily/Workflow-Manager-Architecture/index.html
- 2024-12-06
+ 2025-01-30daily/CPP-Streaming-Component-API/index.html
- 2024-12-06
+ 2025-01-30daily
\ No newline at end of file
From 4f0c6157e4e493e878cbc359272d004bcc9d2fbd Mon Sep 17 00:00:00 2001
From: Brian Rosenberg
Date: Fri, 31 Jan 2025 12:21:48 -0500
Subject: [PATCH 2/7] Update rest api
---
docs/docs/Media-Selectors-Guide.md | 2 +
docs/docs/html/REST-API.html | 83 +++++++++++++++++++++-
docs/site/Media-Selectors-Guide/index.html | 2 +
docs/site/html/REST-API.html | 83 +++++++++++++++++++++-
docs/site/index.html | 2 +-
docs/site/search/search_index.json | 4 +-
docs/site/sitemap.xml | 62 ++++++++--------
7 files changed, 198 insertions(+), 40 deletions(-)
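
Reviewer note on the `expression` field documented in this patch: the guide states that only strings, arrays, and objects are considered when extracting content, and that arrays and objects are explored recursively. The sketch below illustrates that rule in Python. It is a minimal illustration under stated assumptions, not the Workflow Manager's implementation: `collect_strings` is a hypothetical helper, and it starts from a value that a JsonPath expression has already matched rather than evaluating the expression itself.

```python
import json


def collect_strings(value):
    """Hypothetical helper: return the strings reachable from a matched JSON value."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, list):
        # Arrays: explore each element recursively.
        return [s for item in value for s in collect_strings(item)]
    if isinstance(value, dict):
        # Objects: keys are left alone, each value is explored recursively.
        return [s for child in value.values() for s in collect_strings(child)]
    # Numbers, booleans, and null are ignored.
    return []


# Document from the guide's "JSON_PATH Matching Example".
doc = json.loads("""
{
  "key1": ["a", "b", "c"],
  "key2": {
    "key3": [
      {"key4": ["d", "e"], "key5": ["f", "g"], "key6": 6}
    ]
  }
}
""")

print(collect_strings(doc))             # value matched by "$"
print(collect_strings(doc["key2"]))     # value matched by "$.key2"
print(collect_strings(doc["key1"][0]))  # value matched by "$.key1[0]"
```

Run against the guide's example document, the three calls produce the same selections the table lists for `$`, `$.key2`, and `$.key1[0]`; the numeric `key6` value contributes nothing.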
diff --git a/docs/docs/Media-Selectors-Guide.md b/docs/docs/Media-Selectors-Guide.md
index a523ebfd5ff6..599a7e97fa06 100644
--- a/docs/docs/Media-Selectors-Guide.md
+++ b/docs/docs/Media-Selectors-Guide.md
@@ -50,6 +50,8 @@ The first stage performs language identification. The second performs translatio
- `$.media.*.mediaSelectors`: List of media selectors that will be used for the media.
- `$.media.*.mediaSelectors.*.type`: The name of the type of media selector that is used in the
`expression` field.
+- `$.media.*.mediaSelectors.*.expression`: A string specifying the sections of the document that
+ should be processed. The `type` field specifies the syntax of the expression.
- `$.media.*.mediaSelectors.*.resultDetectionProperty`: A detection property name from tracks
produced by the `mediaSelectorsOutputAction`. The media selectors output document will be
populated with the content of the specified property.
diff --git a/docs/docs/html/REST-API.html b/docs/docs/html/REST-API.html
index 781c4aaf5e4d..7e6e1a783b2f 100644
--- a/docs/docs/html/REST-API.html
+++ b/docs/docs/html/REST-API.html
@@ -183,7 +183,7 @@
Version 9.0
-
+
Introduction
@@ -200,6 +200,7 @@
1. Definitions
JobCreationRequest
JobCreationMediaData
JobCreationMediaRange
+
JobCreationMediaSelector
TransientPipelineDefinition
TransientTask
TransientAction
@@ -513,7 +514,7 @@
JobCreationRequest
No
-
+
properties
object
A map of key-value pairs using strings which may be used to override the parameters associated with the pipeline or job properties for this medium.
@@ -529,7 +530,7 @@
JobCreationRequest
No
-
+
timeRanges
array<JobCreationMediaRange>
@@ -540,6 +541,26 @@
JobCreationRequest
No
+
+
mediaSelectorsOutputAction
+
string
+
+ Name of the action that produces content for the media selectors output file.
+ When mediaSelectorsOutputAction is provided,
+ mediaSelectors should also be provided.
+
+
No
+
+
+
mediaSelectors
+
array<JobCreationMediaSelector>
+
+ Used to specify that only specific sections of a document should be processed.
+ When mediaSelectors are present,
+ mediaSelectorsOutputAction must also be present.
+
+
No
+
@@ -572,6 +593,62 @@
JobCreationRequest
+
+
JobCreationMediaSelector
+
+
+
+
+
Name
+
Type
+
Description
+
Required
+
+
+
+
+
expression
+
string
+
+ A string specifying the sections of the document that should be processed.
+ The type field specifies the syntax of the expression.
+
+
Yes
+
+
+
type
+
string
+
+ The name of the type of media selector that is used in the
+ expression field. "JSON_PATH" is the only type currently supported.
+
+
Yes
+
+
+
selectionProperties
+
object<string, string>
+
+ A map of key-value pairs using strings which may be used to override the
+ parameters associated with the pipeline or job properties for the sections of
+ the document that match the expression.
+
+
No
+
+
+
resultDetectionProperty
+
string
+
+ A detection property name from tracks produced by the
+ mediaSelectorsOutputAction. The media selectors output document
+ will be populated with the content of the specified property.
+
$.media.*.mediaSelectors: List of media selectors that will be used for the media.
$.media.*.mediaSelectors.*.type: The name of the type of media selector that is used in the
expression field.
+
$.media.*.mediaSelectors.*.expression: A string specifying the sections of the document that
+ should be processed. The type field specifies the syntax of the expression.
$.media.*.mediaSelectors.*.resultDetectionProperty: A detection property name from tracks
produced by the mediaSelectorsOutputAction. The media selectors output document will be
populated with the content of the specified property.
A map of key-value pairs using strings which may be used to override the parameters associated with the pipeline or job properties for this medium.
@@ -529,7 +530,7 @@
JobCreationRequest
No
-
+
timeRanges
array<JobCreationMediaRange>
@@ -540,6 +541,26 @@
JobCreationRequest
No
+
+
mediaSelectorsOutputAction
+
string
+
+ Name of the action that produces content for the media selectors output file.
+ When mediaSelectorsOutputAction is provided,
+ mediaSelectors should also be provided.
+
+
No
+
+
+
mediaSelectors
+
array<JobCreationMediaSelector>
+
+ Used to specify that only specific sections of a document should be processed.
+ When mediaSelectors are present,
+ mediaSelectorsOutputAction must also be present.
+
+
No
+
@@ -572,6 +593,62 @@
JobCreationRequest
+
+
JobCreationMediaSelector
+
+
+
+
+
Name
+
Type
+
Description
+
Required
+
+
+
+
+
expression
+
string
+
+ A string specifying the sections of the document that should be processed.
+ The type field specifies the syntax of the expression.
+
+
Yes
+
+
+
type
+
string
+
+ The name of the type of media selector that is used in the
+ expression field. "JSON_PATH" is the only type currently supported.
+
+
Yes
+
+
+
selectionProperties
+
object<string, string>
+
+ A map of key-value pairs using strings which may be used to override the
+ parameters associated with the pipeline or job properties for the sections of
+ the document that match the expression.
+
+
No
+
+
+
resultDetectionProperty
+
string
+
+ A detection property name from tracks produced by the
+ mediaSelectorsOutputAction. The media selectors output document
+ will be populated with the content of the specified property.
+
diff --git a/docs/site/search/search_index.json b/docs/site/search/search_index.json
index 71b32d514819..00f2ca29cfe5 100644
--- a/docs/site/search/search_index.json
+++ b/docs/site/search/search_index.json
@@ -632,7 +632,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html",
- "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the type of media selector that is used in the\n \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. 
Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\" 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese 
recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}",
+ "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the type of media selector that is used in the\n \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A string specifying the sections of the document that\n should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\" 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": 
\"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}",
"title": "Media Selectors Guide"
},
{
@@ -642,7 +642,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html#new-job-request-fields",
- "text": "Below is an example of a job that uses media selectors. The job uses a two stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector that is used in the\n expression field. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. $.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector.",
+ "text": "Below is an example of a job that uses media selectors. The job uses a two stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector that is used in the\n expression field. $.media.*.mediaSelectors.*.expression : A string specifying the sections of the document that\n should be processed. The type field specifies the syntax of the expression. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. 
$.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector.",
"title": "New Job Request Fields"
},
{
diff --git a/docs/site/sitemap.xml b/docs/site/sitemap.xml
index 7e6ec8bb1eed..cfeb1eb02960 100644
--- a/docs/site/sitemap.xml
+++ b/docs/site/sitemap.xml
@@ -2,157 +2,157 @@
/index.html
- 2025-01-30
+ 2025-01-31daily/Release-Notes/index.html
- 2025-01-30
+ 2025-01-31daily/License-And-Distribution/index.html
- 2025-01-30
+ 2025-01-31daily/Acknowledgements/index.html
- 2025-01-30
+ 2025-01-31daily/Install-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Admin-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/User-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/OpenID-Connect-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Media-Segmentation-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Feed-Forward-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Derivative-Media-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Object-Storage-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Markup-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/TiesDb-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Trigger-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Roll-Up-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Health-Check-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Quality-Selection-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Media-Selectors-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/REST-API/index.html
- 2025-01-30
+ 2025-01-31daily/Component-API-Overview/index.html
- 2025-01-30
+ 2025-01-31daily/Component-Descriptor-Reference/index.html
- 2025-01-30
+ 2025-01-31daily/CPP-Batch-Component-API/index.html
- 2025-01-30
+ 2025-01-31daily/Python-Batch-Component-API/index.html
- 2025-01-30
+ 2025-01-31daily/Java-Batch-Component-API/index.html
- 2025-01-30
+ 2025-01-31daily/GPU-Support-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Contributor-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Development-Environment-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Node-Guide/index.html
- 2025-01-30
+ 2025-01-31daily/Workflow-Manager-Architecture/index.html
- 2025-01-30
+ 2025-01-31daily/CPP-Streaming-Component-API/index.html
- 2025-01-30
+ 2025-01-31daily
\ No newline at end of file
From 5b7bb719362e4f84039328af934a1014dfeaeff0 Mon Sep 17 00:00:00 2001
From: Brian Rosenberg
Date: Mon, 3 Feb 2025 10:54:05 -0500
Subject: [PATCH 3/7] Update docs
---
docs/docs/Media-Selectors-Guide.md | 46 ++++++++++++++--
docs/site/Media-Selectors-Guide/index.html | 43 +++++++++++++--
docs/site/index.html | 2 +-
docs/site/search/search_index.json | 8 +--
docs/site/sitemap.xml | 62 +++++++++++-----------
5 files changed, 117 insertions(+), 44 deletions(-)
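
Note on the `MEDIA_SELECTORS_DELIMETER` example added to the guide in this patch: the behavior the guide describes (the selection is replaced by the action output when the property is absent; original content, then the delimiter, then the action output when it is present) can be sketched roughly as below. This is only an illustration of the documented behavior, not the Workflow Manager's implementation; the function name `merge_selected_content` and its parameters are hypothetical.

```python
def merge_selected_content(original, action_output, delimiter=None):
    """Return the text written back into one selected part of the document.

    Hypothetical helper: `delimiter` stands in for the value of the
    MEDIA_SELECTORS_DELIMETER job property (None when it is not provided).
    """
    if delimiter is None:
        # Property not provided: the action output replaces the selection.
        return action_output
    # Property provided: original content, then delimiter, then action output.
    return original + delimiter + action_output


original = "¿Dónde está la biblioteca?"
translated = "Where is the library?"

print(merge_selected_content(original, translated))
print(merge_selected_content(original, translated, " | Translation: "))
```

With the delimiter set to `" | Translation: "`, the second call yields the same `content` value shown for "spanish recipient 2" in the new example.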
diff --git a/docs/docs/Media-Selectors-Guide.md b/docs/docs/Media-Selectors-Guide.md
index 599a7e97fa06..b545858f6573 100644
--- a/docs/docs/Media-Selectors-Guide.md
+++ b/docs/docs/Media-Selectors-Guide.md
@@ -11,7 +11,7 @@ produced.
# New Job Request Fields
-Below is an example of a job that uses media selectors. The job uses a two stage pipeline.
+Below is an example of a job that uses media selectors. The job uses a two-stage pipeline.
The first stage performs language identification. The second performs translation.
```json
{
@@ -48,8 +48,8 @@ The first stage performs language identification. The second performs translatio
from Argos in the media selectors output file rather than the detected language from the first
stage.
- `$.media.*.mediaSelectors`: List of media selectors that will be used for the media.
-- `$.media.*.mediaSelectors.*.type`: The name of the type of media selector that is used in the
- `expression` field.
+- `$.media.*.mediaSelectors.*.type`: The name of the [type of media selector](#media-selector-types)
+ that is used in the `expression` field.
- `$.media.*.mediaSelectors.*.expression`: A string specifying the sections of the document that
should be processed. The `type` field specifies the syntax of the expression.
- `$.media.*.mediaSelectors.*.resultDetectionProperty`: A detection property name from tracks
@@ -99,7 +99,7 @@ of the object is recursively explored.
{
"key4": ["d", "e"],
"key5": ["f", "g"],
- "key6" 6
+ "key6": 6
}
]
}
@@ -195,3 +195,41 @@ The `mediaSelectorsOutputUri` field will refer to a document containing the cont
]
}
```
+
+If `MEDIA_SELECTORS_DELIMETER` was set to " | Translation: ", the file would contain the content
+below.
+
+```json
+{
+ "otherStuffKey": ["other stuff value"],
+ "spanishMessages": [
+ {
+ "to": "spanish recipient 1",
+ "from": "spanish sender 1",
+ "content": "¿Hola, cómo estás? | Translation: Hello, how are you?"
+ },
+ {
+ "to": "spanish recipient 2",
+ "from": "spanish sender 2",
+ "content": "¿Dónde está la biblioteca? | Translation: Where is the library?"
+ }
+ ],
+ "chineseMessages": [
+ {
+ "to": "chinese recipient 1",
+ "from": "chinese sender 1",
+ "content": "现在是几奌? | Translation: What time is it?"
+ },
+ {
+ "to": "chinese recipient 2",
+ "from": "chinese sender 2",
+ "content": "你叫什么名字? | Translation: What is your name?"
+ },
+ {
+ "to": "chinese recipient 3",
+ "from": "chinese sender 3",
+ "content": "你在哪里? | Translation: Where are you?"
+ }
+ ]
+}
+```
diff --git a/docs/site/Media-Selectors-Guide/index.html b/docs/site/Media-Selectors-Guide/index.html
index 934dee7de15e..d34218bbbae9 100644
--- a/docs/site/Media-Selectors-Guide/index.html
+++ b/docs/site/Media-Selectors-Guide/index.html
@@ -268,7 +268,7 @@
Media Selectors Overview
processed. A copy of the input file with the specified sections replaced by component output is
produced.
New Job Request Fields
-
Below is an example of a job that uses media selectors. The job uses a two stage pipeline.
+
Below is an example of a job that uses media selectors. The job uses a two-stage pipeline.
The first stage performs language identification. The second performs translation.
{
"algorithmProperties": {},
@@ -305,8 +305,8 @@
New Job Request Fields
from Argos in the media selectors output file rather than the detected language from the first
stage.
$.media.*.mediaSelectors: List of media selectors that will be used for the media.
-
$.media.*.mediaSelectors.*.type: The name of the type of media selector that is used in the
- expression field.
+
$.media.*.mediaSelectors.*.type: The name of the type of media selector
+ that is used in the expression field.
$.media.*.mediaSelectors.*.expression: A string specifying the sections of the document that
should be processed. The type field specifies the syntax of the expression.
$.media.*.mediaSelectors.*.resultDetectionProperty: A detection property name from tracks
@@ -347,7 +347,7 @@
diff --git a/docs/site/search/search_index.json b/docs/site/search/search_index.json
index 00f2ca29cfe5..753c9e2ba5f3 100644
--- a/docs/site/search/search_index.json
+++ b/docs/site/search/search_index.json
@@ -632,7 +632,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html",
- "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the type of media selector that is used in the\n \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A string specifying the sections of the document that\n should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\" 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": 
\"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}",
+ "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the \ntype of media selector\n\n that is used in the \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A string specifying the sections of the document that\n should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": 
\"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}\n\n\n\nIf \nMEDIA_SELECTORS_DELIMETER\n was set to \" | Translation: \", the file would contain the content\nbelow.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s? | Translation: Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca? 
| Translation: Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f | Translation: What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f | Translation: What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f | Translation: Where are you?\"\n }\n ]\n}",
"title": "Media Selectors Guide"
},
{
@@ -642,7 +642,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html#new-job-request-fields",
- "text": "Below is an example of a job that uses media selectors. The job uses a two stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector that is used in the\n expression field. $.media.*.mediaSelectors.*.expression : A string specifying the sections of the document that\n should be processed. The type field specifies the syntax of the expression. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. 
$.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector.",
+ "text": "Below is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector \n that is used in the expression field. $.media.*.mediaSelectors.*.expression : A string specifying the sections of the document that\n should be processed. The type field specifies the syntax of the expression. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. 
$.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector.",
"title": "New Job Request Fields"
},
{
@@ -662,12 +662,12 @@
},
{
"location": "/Media-Selectors-Guide/index.html#json_path-matching-example",
- "text": "{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\" 6\n }\n ]\n }\n} Expression Matches $ a, b, c, d, e, f, g $.* a, b, c, d, e, f, g $.key1 a, b, c $.key1[0] a $.key2 d, e, f, g $.key2.key3 d, e, f, g $.key2.key3.*.key4 d, e $.key2.key3.*.*[0] d, f",
+ "text": "{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n} Expression Matches $ a, b, c, d, e, f, g $.* a, b, c, d, e, f, g $.key1 a, b, c $.key1[0] a $.key2 d, e, f, g $.key2.key3 d, e, f, g $.key2.key3.*.key4 d, e $.key2.key3.*.*[0] d, f",
"title": "JSON_PATH Matching Example"
},
{
"location": "/Media-Selectors-Guide/index.html#media-selectors-output-file",
- "text": "When media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the $.media.*.mediaSelectorsOutputUri field. The job from the New Job Request Fields section could be used with the\ndocument below. {\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n} The mediaSelectorsOutputUri field will refer to a document containing the content below. {\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}",
+ "text": "When media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the $.media.*.mediaSelectorsOutputUri field. The job from the New Job Request Fields section could be used with the\ndocument below. {\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n} The mediaSelectorsOutputUri field will refer to a document containing the content below. {\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n} If MEDIA_SELECTORS_DELIMETER was set to \" | Translation: \", the file would contain the content\nbelow. 
{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s? | Translation: Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca? | Translation: Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f | Translation: What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f | Translation: What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f | Translation: Where are you?\"\n }\n ]\n}",
"title": "Media Selectors Output File"
},
{
diff --git a/docs/site/sitemap.xml b/docs/site/sitemap.xml
index cfeb1eb02960..43971170073b 100644
--- a/docs/site/sitemap.xml
+++ b/docs/site/sitemap.xml
@@ -2,157 +2,157 @@
/index.html
- 2025-01-31
+ 2025-02-03daily/Release-Notes/index.html
- 2025-01-31
+ 2025-02-03daily/License-And-Distribution/index.html
- 2025-01-31
+ 2025-02-03daily/Acknowledgements/index.html
- 2025-01-31
+ 2025-02-03daily/Install-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Admin-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/User-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/OpenID-Connect-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Media-Segmentation-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Feed-Forward-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Derivative-Media-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Object-Storage-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Markup-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/TiesDb-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Trigger-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Roll-Up-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Health-Check-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Quality-Selection-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Media-Selectors-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/REST-API/index.html
- 2025-01-31
+ 2025-02-03daily/Component-API-Overview/index.html
- 2025-01-31
+ 2025-02-03daily/Component-Descriptor-Reference/index.html
- 2025-01-31
+ 2025-02-03daily/CPP-Batch-Component-API/index.html
- 2025-01-31
+ 2025-02-03daily/Python-Batch-Component-API/index.html
- 2025-01-31
+ 2025-02-03daily/Java-Batch-Component-API/index.html
- 2025-01-31
+ 2025-02-03daily/GPU-Support-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Contributor-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Development-Environment-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Node-Guide/index.html
- 2025-01-31
+ 2025-02-03daily/Workflow-Manager-Architecture/index.html
- 2025-01-31
+ 2025-02-03daily/CPP-Streaming-Component-API/index.html
- 2025-01-31
+ 2025-02-03daily
\ No newline at end of file
From 67dd52d04dec2ebf3a0f93c4e50fc632489a4f2e Mon Sep 17 00:00:00 2001
From: Brian Rosenberg
Date: Tue, 4 Feb 2025 13:06:55 -0500
Subject: [PATCH 4/7] update docs
---
docs/docs/Media-Selectors-Guide.md | 3 +-
docs/site/Media-Selectors-Guide/index.html | 3 +-
docs/site/index.html | 2 +-
docs/site/search/search_index.json | 4 +-
docs/site/sitemap.xml | 62 +++++++++++-----------
5 files changed, 38 insertions(+), 36 deletions(-)
diff --git a/docs/docs/Media-Selectors-Guide.md b/docs/docs/Media-Selectors-Guide.md
index b545858f6573..2b43685a3c85 100644
--- a/docs/docs/Media-Selectors-Guide.md
+++ b/docs/docs/Media-Selectors-Guide.md
@@ -82,7 +82,8 @@ The first stage performs language identification. The second performs translatio
Used to extract content from JSON files. Uses the "Jayway JsonPath" library to parse the expressions.
The specific syntax supported is available on their
-[GitHub page](https://github.com/json-path/JsonPath?tab=readme-ov-file#operators).
+[GitHub page](https://github.com/json-path/JsonPath?tab=readme-ov-file#operators). JsonPath
+expressions are case-sensitive.
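
To illustrate the case sensitivity noted here, together with the guide's rule that only strings, arrays, and objects are explored, the following is a minimal Python sketch. It is not the Jayway JsonPath library and does not parse JsonPath syntax: `select` and `collect_strings` are hypothetical helpers that only walk a list of exact keys and gather string leaves. The sample document is the one from the guide's JSON_PATH Matching Example.

```python
def collect_strings(value):
    """Recursively gather string leaves.

    Arrays and object values are explored; numbers, booleans, and null
    are ignored, mirroring the behavior described in the guide.
    """
    if isinstance(value, str):
        return [value]
    if isinstance(value, list):
        return [s for item in value for s in collect_strings(item)]
    if isinstance(value, dict):
        return [s for item in value.values() for s in collect_strings(item)]
    return []


def select(document, keys):
    """Follow a list of exact, case-sensitive keys (hypothetical helper).

    Returns None when any key is missing, including when only the
    letter case differs.
    """
    current = document
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


doc = {
    "key1": ["a", "b", "c"],
    "key2": {"key3": [{"key4": ["d", "e"], "key5": ["f", "g"], "key6": 6}]},
}

if __name__ == "__main__":
    # Roughly corresponds to the expression "$.key2" in the guide's table.
    print(collect_strings(select(doc, ["key2"])))
    # A key that differs only in case selects nothing.
    print(collect_strings(select(doc, ["Key2"])))
```

In this sketch, the numeric value under `key6` never appears in the result, and `Key2` selects nothing because key comparison is exact.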
When extracting content from the document, only strings, arrays, and objects are considered. All
other JSON types are ignored. When the JsonPath expression matches an array, each element is
diff --git a/docs/site/Media-Selectors-Guide/index.html b/docs/site/Media-Selectors-Guide/index.html
index d34218bbbae9..d198bb34a26b 100644
--- a/docs/site/Media-Selectors-Guide/index.html
+++ b/docs/site/Media-Selectors-Guide/index.html
@@ -334,7 +334,8 @@
Media Selector Types
JSON_PATH
Used to extract content from JSON files. Uses the "Jayway JsonPath" library to parse the expressions.
The specific syntax supported is available on their
-GitHub page.
+GitHub page. JsonPath
+expressions are case-sensitive.
When extracting content from the document, only strings, arrays, and objects are considered. All
other JSON types are ignored. When the JsonPath expression matches an array, each element is
recursively explored. When the expression matches an object, keys are left unchanged and each value
diff --git a/docs/site/index.html b/docs/site/index.html
index 96b533193089..f8cf0ed70ebc 100644
--- a/docs/site/index.html
+++ b/docs/site/index.html
@@ -399,5 +399,5 @@
Overview
diff --git a/docs/site/search/search_index.json b/docs/site/search/search_index.json
index 753c9e2ba5f3..ccb6bfe2b667 100644
--- a/docs/site/search/search_index.json
+++ b/docs/site/search/search_index.json
@@ -632,7 +632,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html",
- "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the \ntype of media selector\n\n that is used in the \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A string specifying the sections of the document that\n should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": 
\"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}\n\n\n\nIf \nMEDIA_SELECTORS_DELIMETER\n was set to \" | Translation: \", the file would contain the content\nbelow.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s? | Translation: Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca? 
| Translation: Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f | Translation: What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f | Translation: What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f | Translation: Where are you?\"\n }\n ]\n}",
+ "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the \ntype of media selector\n\n that is used in the \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A string specifying the sections of the document that\n should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n. JsonPath\nexpressions are case-sensitive.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n 
\"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}\n\n\n\nIf \nMEDIA_SELECTORS_DELIMETER\n was set to \" | Translation: \", the file would contain the content\nbelow.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s? | Translation: Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca? 
| Translation: Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f | Translation: What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f | Translation: What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f | Translation: Where are you?\"\n }\n ]\n}",
"title": "Media Selectors Guide"
},
{
@@ -657,7 +657,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html#json_path",
- "text": "Used to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their GitHub page . When extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.",
+ "text": "Used to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their GitHub page . JsonPath\nexpressions are case-sensitive. When extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.",
"title": "JSON_PATH"
},
{
diff --git a/docs/site/sitemap.xml b/docs/site/sitemap.xml
index 43971170073b..8872edcefcd6 100644
--- a/docs/site/sitemap.xml
+++ b/docs/site/sitemap.xml
@@ -2,157 +2,157 @@
/index.html
- 2025-02-03
+ 2025-02-04daily/Release-Notes/index.html
- 2025-02-03
+ 2025-02-04daily/License-And-Distribution/index.html
- 2025-02-03
+ 2025-02-04daily/Acknowledgements/index.html
- 2025-02-03
+ 2025-02-04daily/Install-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Admin-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/User-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/OpenID-Connect-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Media-Segmentation-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Feed-Forward-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Derivative-Media-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Object-Storage-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Markup-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/TiesDb-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Trigger-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Roll-Up-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Health-Check-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Quality-Selection-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Media-Selectors-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/REST-API/index.html
- 2025-02-03
+ 2025-02-04daily/Component-API-Overview/index.html
- 2025-02-03
+ 2025-02-04daily/Component-Descriptor-Reference/index.html
- 2025-02-03
+ 2025-02-04daily/CPP-Batch-Component-API/index.html
- 2025-02-03
+ 2025-02-04daily/Python-Batch-Component-API/index.html
- 2025-02-03
+ 2025-02-04daily/Java-Batch-Component-API/index.html
- 2025-02-03
+ 2025-02-04daily/GPU-Support-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Contributor-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Development-Environment-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Node-Guide/index.html
- 2025-02-03
+ 2025-02-04daily/Workflow-Manager-Architecture/index.html
- 2025-02-03
+ 2025-02-04daily/CPP-Streaming-Component-API/index.html
- 2025-02-03
+ 2025-02-04daily
\ No newline at end of file
From 8ff99d0be6fb115c70d1b64f5e8b1750b4c7cc1a Mon Sep 17 00:00:00 2001
From: jrobble
Date: Tue, 4 Feb 2025 16:08:59 -0500
Subject: [PATCH 5/7] Mention case-sensitive and add example.
---
docs/docs/Media-Selectors-Guide.md | 10 +++++++---
docs/site/Media-Selectors-Guide/index.html | 9 ++++++---
docs/site/index.html | 2 +-
docs/site/search/search_index.json | 4 ++--
4 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/docs/docs/Media-Selectors-Guide.md b/docs/docs/Media-Selectors-Guide.md
index 2b43685a3c85..1d4e522004af 100644
--- a/docs/docs/Media-Selectors-Guide.md
+++ b/docs/docs/Media-Selectors-Guide.md
@@ -50,16 +50,20 @@ The first stage performs language identification. The second performs translatio
- `$.media.*.mediaSelectors`: List of media selectors that will be used for the media.
- `$.media.*.mediaSelectors.*.type`: The name of the [type of media selector](#media-selector-types)
that is used in the `expression` field.
-- `$.media.*.mediaSelectors.*.expression`: A string specifying the sections of the document that
- should be processed. The `type` field specifies the syntax of the expression.
+- `$.media.*.mediaSelectors.*.expression`: A case-sensitive string specifying the sections of the
+ document that should be processed. The `type` field specifies the syntax of the expression.
- `$.media.*.mediaSelectors.*.resultDetectionProperty`: A detection property name from tracks
produced by the `mediaSelectorsOutputAction`. The media selectors output document will be
populated with the content of the specified property.
- `$.media.*.mediaSelectors.*.selectionProperties`: Job properties that will only be used for
- sub-jobs created for a specific media selector.
+ sub-jobs created for a specific media selector. For example, when performing Argos translation
+ on a JSON file in a single-stage pipeline without an upstream language detection stage, this
+ field could set `DEFAULT_SOURCE_LANGUAGE=es` for some media selectors and
+ `DEFAULT_SOURCE_LANGUAGE=zh` for others.
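
The `selectionProperties` use case above can be sketched as a job request built in Python. The field names and selector expressions come from the example job in this guide; the helper names `build_selector` and `build_job_request` are hypothetical, and the pipeline and action names passed in are placeholders rather than names of real pipelines.

```python
import json


def build_selector(expression, source_language):
    """Build one JSON_PATH media selector with its own source language."""
    return {
        "type": "JSON_PATH",
        "expression": expression,
        "resultDetectionProperty": "TRANSLATION",
        # Applies only to sub-jobs created for this selector.
        "selectionProperties": {"DEFAULT_SOURCE_LANGUAGE": source_language},
    }


def build_job_request(media_uri, pipeline_name, output_action):
    """Assemble a job request with one selector per language."""
    return {
        "algorithmProperties": {},
        "buildOutput": True,
        "jobProperties": {},
        "media": [
            {
                "mediaUri": media_uri,
                "properties": {},
                "mediaSelectorsOutputAction": output_action,
                "mediaSelectors": [
                    build_selector("$.spanishMessages.*.content", "es"),
                    build_selector("$.chineseMessages.*.content", "zh"),
                ],
            }
        ],
        "pipelineName": pipeline_name,
        "priority": 4,
    }


if __name__ == "__main__":
    request = build_job_request(
        "file:///opt/mpf/share/remote-media/test-json-path-translation.json",
        "EXAMPLE SINGLE-STAGE TRANSLATION PIPELINE",  # placeholder name
        "EXAMPLE TRANSLATION ACTION",  # placeholder name
    )
    print(json.dumps(request, indent=2))
```

Keeping the language setting inside each selector, rather than in `jobProperties`, is what lets a single-stage job translate the Spanish and Chinese sections with different source languages.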
# New Job Properties
+
- `MEDIA_SELECTORS_DELIMETER`: When not provided and a job uses media selectors, the selected parts
of the document will be replaced with the action output. When provided, the selected parts of
the document will contain the original content, followed by the value of this property, and
diff --git a/docs/site/Media-Selectors-Guide/index.html b/docs/site/Media-Selectors-Guide/index.html
index d198bb34a26b..7cc88f1be5de 100644
--- a/docs/site/Media-Selectors-Guide/index.html
+++ b/docs/site/Media-Selectors-Guide/index.html
@@ -307,13 +307,16 @@
New Job Request Fields
$.media.*.mediaSelectors: List of media selectors that will be used for the media.
$.media.*.mediaSelectors.*.type: The name of the type of media selector
that is used in the expression field.
-
$.media.*.mediaSelectors.*.expression: A string specifying the sections of the document that
- should be processed. The type field specifies the syntax of the expression.
+
$.media.*.mediaSelectors.*.expression: A case-sensitive string specifying the sections of the
+ document that should be processed. The type field specifies the syntax of the expression.
$.media.*.mediaSelectors.*.resultDetectionProperty: A detection property name from tracks
produced by the mediaSelectorsOutputAction. The media selectors output document will be
populated with the content of the specified property.
$.media.*.mediaSelectors.*.selectionProperties: Job properties that will only be used for
- sub-jobs created for a specific media selector.
+ sub-jobs created for a specific media selector. For example, when performing Argos translation
+ on a JSON file in a single-stage pipeline without an upstream language detection stage, this
+ field could set DEFAULT_SOURCE_LANGUAGE=es for some media selectors and
+ DEFAULT_SOURCE_LANGUAGE=zh for others.
diff --git a/docs/site/search/search_index.json b/docs/site/search/search_index.json
index ccb6bfe2b667..e5dfa8d27c1d 100644
--- a/docs/site/search/search_index.json
+++ b/docs/site/search/search_index.json
@@ -632,7 +632,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html",
- "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the \ntype of media selector\n\n that is used in the \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A string specifying the sections of the document that\n should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n. JsonPath\nexpressions are case-sensitive.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n 
\"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}\n\n\n\nIf \nMEDIA_SELECTORS_DELIMETER\n was set to \" | Translation: \", the file would contain the content\nbelow.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s? | Translation: Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca? 
| Translation: Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f | Translation: What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f | Translation: What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f | Translation: Where are you?\"\n }\n ]\n}",
+ "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the \ntype of media selector\n\n that is used in the \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A case-sensitive string specifying the sections of the\n document that should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector. For example, when performing Argos translation\n on a JSON file in a single-stage pipeline without an upstream language detection stage, this\n could set \nDEFAULT_SOURCE_LANGUAGE=es\n for some media selectors and\n \nDEFAULT_SOURCE_LANGUAGE=zh\n for others.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n. JsonPath\nexpressions are case-sensitive.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n 
\"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}\n\n\n\nIf \nMEDIA_SELECTORS_DELIMETER\n was set to \" | Translation: \", the file would contain the content\nbelow.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s? | Translation: Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca? 
| Translation: Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f | Translation: What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f | Translation: What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f | Translation: Where are you?\"\n }\n ]\n}",
"title": "Media Selectors Guide"
},
{
@@ -642,7 +642,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html#new-job-request-fields",
- "text": "Below is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector \n that is used in the expression field. $.media.*.mediaSelectors.*.expression : A string specifying the sections of the document that\n should be processed. The type field specifies the syntax of the expression. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. 
$.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector.",
+ "text": "Below is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector \n that is used in the expression field. $.media.*.mediaSelectors.*.expression : A case-sensitive string specifying the sections of the\n document that should be processed. The type field specifies the syntax of the expression. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. 
$.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector. For example, when performing Argos translation\n on a JSON file in a single-stage pipeline without an upstream language detection stage, this\n could set DEFAULT_SOURCE_LANGUAGE=es for some media selectors and\n DEFAULT_SOURCE_LANGUAGE=zh for others.",
"title": "New Job Request Fields"
},
{
From 160ae592f07fa89c0306e082bf365b888571b57c Mon Sep 17 00:00:00 2001
From: Brian Rosenberg
Date: Wed, 5 Feb 2025 09:59:47 -0500
Subject: [PATCH 6/7] Revert change to mediaSelectors.*.expression
---
docs/docs/Media-Selectors-Guide.md | 4 +-
docs/site/Media-Selectors-Guide/index.html | 4 +-
docs/site/index.html | 2 +-
docs/site/search/search_index.json | 4 +-
docs/site/sitemap.xml | 62 +++++++++++-----------
5 files changed, 38 insertions(+), 38 deletions(-)
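
The guide text carried in this series says that, for a `JSON_PATH` match, only strings, arrays, and objects are considered, and that arrays and objects are explored recursively. A minimal Python sketch of that leaf-collection rule follows. It is an illustration under stated assumptions, not the Workflow Manager's implementation: `collect_strings` and `EXAMPLE` are hypothetical names, and the JsonPath evaluation itself (done by the Jayway JsonPath library in the real system) is replaced here by plain dictionary indexing.

```python
import json


def collect_strings(node):
    """Recursively gather the string leaves of an already-matched JSON value.

    Hypothetical helper: it mirrors the rule described in the guide, where
    strings are kept, arrays and objects are explored, and everything else
    (numbers, booleans, null) is ignored.
    """
    if isinstance(node, str):
        return [node]
    if isinstance(node, list):
        return [s for item in node for s in collect_strings(item)]
    if isinstance(node, dict):
        # Keys are left alone; only the values are explored.
        return [s for value in node.values() for s in collect_strings(value)]
    return []


# The document from the guide's "JSON_PATH Matching Example" section.
EXAMPLE = json.loads("""
{
  "key1": ["a", "b", "c"],
  "key2": {
    "key3": [
      {"key4": ["d", "e"], "key5": ["f", "g"], "key6": 6}
    ]
  }
}
""")

# Stand-in for the expression `$`: the whole document is the match.
print(collect_strings(EXAMPLE))
# Stand-in for the expression `$.key2`: plain indexing instead of JsonPath.
print(collect_strings(EXAMPLE["key2"]))
```

The numeric value under `key6` contributes nothing, which is consistent with the guide's table listing only `d, e, f, g` for `$.key2`.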
diff --git a/docs/docs/Media-Selectors-Guide.md b/docs/docs/Media-Selectors-Guide.md
index 1d4e522004af..fd130a4c061a 100644
--- a/docs/docs/Media-Selectors-Guide.md
+++ b/docs/docs/Media-Selectors-Guide.md
@@ -50,8 +50,8 @@ The first stage performs language identification. The second performs translatio
 - `$.media.*.mediaSelectors`: List of media selectors that will be used for the media.
 - `$.media.*.mediaSelectors.*.type`: The name of the [type of media selector](#media-selector-types)
   that is used in the `expression` field.
-- `$.media.*.mediaSelectors.*.expression`: A case-sensitive string specifying the sections of the
-  document that should be processed. The `type` field specifies the syntax of the expression.
+- `$.media.*.mediaSelectors.*.expression`: A string specifying the sections of the document that
+  should be processed. The `type` field specifies the syntax of the expression.
 - `$.media.*.mediaSelectors.*.resultDetectionProperty`: A detection property name from tracks
   produced by the `mediaSelectorsOutputAction`. The media selectors output document will be
   populated with the content of the specified property.
diff --git a/docs/site/Media-Selectors-Guide/index.html b/docs/site/Media-Selectors-Guide/index.html
index 7cc88f1be5de..dd00e637e1f6 100644
--- a/docs/site/Media-Selectors-Guide/index.html
+++ b/docs/site/Media-Selectors-Guide/index.html
@@ -307,8 +307,8 @@
New Job Request Fields
$.media.*.mediaSelectors: List of media selectors that will be used for the media.
$.media.*.mediaSelectors.*.type: The name of the type of media selector
that is used in the expression field.
-$.media.*.mediaSelectors.*.expression: A case-sensitive string specifying the sections of the
- document that should be processed. The type field specifies the syntax of the expression.
+$.media.*.mediaSelectors.*.expression: A string specifying the sections of the document that
+ should be processed. The type field specifies the syntax of the expression.
$.media.*.mediaSelectors.*.resultDetectionProperty: A detection property name from tracks
produced by the mediaSelectorsOutputAction. The media selectors output document will be
populated with the content of the specified property.
diff --git a/docs/site/search/search_index.json b/docs/site/search/search_index.json
index e5dfa8d27c1d..e1a2655d07d3 100644
--- a/docs/site/search/search_index.json
+++ b/docs/site/search/search_index.json
@@ -632,7 +632,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html",
- "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the \ntype of media selector\n\n that is used in the \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A case-sensitive string specifying the sections of the\n document that should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector. For example, when performing Argos translation\n on a JSON file in a single-stage pipeline without an upstream language detection stage, this\n could set \nDEFAULT_SOURCE_LANGUAGE=es\n for some media selectors and\n \nDEFAULT_SOURCE_LANGUAGE=zh\n for others.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n. JsonPath\nexpressions are case-sensitive.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n 
\"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}\n\n\n\nIf \nMEDIA_SELECTORS_DELIMETER\n was set to \" | Translation: \", the file would contain the content\nbelow.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s? | Translation: Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca? 
| Translation: Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f | Translation: What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f | Translation: What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f | Translation: Where are you?\"\n }\n ]\n}",
+ "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract,\nand is subject to the Rights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2025\nThe MITRE Corporation. All Rights Reserved.\n\n\nMedia Selectors Overview\n\n\nMedia selectors allow users to specify that only specific sections of a document should be\nprocessed. A copy of the input file with the specified sections replaced by component output is\nproduced.\n\n\nNew Job Request Fields\n\n\nBelow is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation.\n\n\n{\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n}\n\n\n\n\n\n$.media.*.mediaSelectorsOutputAction\n: Name of the action that produces content for the media\n selectors output file. 
In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage.\n\n\n$.media.*.mediaSelectors\n: List of media selectors that will be used for the media.\n\n\n$.media.*.mediaSelectors.*.type\n: The name of the \ntype of media selector\n\n that is used in the \nexpression\n field.\n\n\n$.media.*.mediaSelectors.*.expression\n: A string specifying the sections of the document that\n should be processed. The \ntype\n field specifies the syntax of the expression.\n\n\n$.media.*.mediaSelectors.*.resultDetectionProperty\n: A detection property name from tracks\n produced by the \nmediaSelectorsOutputAction\n. The media selectors output document will be\n populated with the content of the specified property.\n\n\n$.media.*.mediaSelectors.*.selectionProperties\n: Job properties that will only be used for\n sub-jobs created for a specific media selector. For example, when performing Argos translation\n on a JSON file in a single-stage pipeline without an upstream language detection stage, this\n could set \nDEFAULT_SOURCE_LANGUAGE=es\n for some media selectors and\n \nDEFAULT_SOURCE_LANGUAGE=zh\n for others.\n\n\n\n\nNew Job Properties\n\n\n\n\nMEDIA_SELECTORS_DELIMETER\n: When not provided and a job uses media selectors, the selected parts\n of the document will be replaced with the action output. When provided, the selected parts of\n the document will contain the original content, followed by the value of this property, and\n finally the action output.\n\n\nMEDIA_SELECTORS_DUPLICATE_POLICY\n: Specifies how to handle the case where a job uses media\n selectors and there are multiple outputs for a single selection. When set to \nLONGEST\n, the\n longer of the two outputs is chosen and the shorter one is discarded. When set to \nERROR\n,\n duplicates are considered an error. 
When set to \nJOIN\n, the duplicates are combined using\n \n|\n as a delimiter.\n\n\nMEDIA_SELECTORS_NO_MATCHES_IS_ERROR\n: When true and a job uses media selectors, an error will be\n generated when none of the selectors match content from the media.\n\n\n\n\nMedia Selector Types\n\n\nJSON_PATH\n is only type currently supported, but others are planned.\n\n\nJSON_PATH\n\n\nUsed to extract content for JSON files. Uses the \"Jayway JsonPath\" library to parse the expressions.\nThe specific syntax supported is available on their\n\nGitHub page\n. JsonPath\nexpressions are case-sensitive.\n\n\nWhen extracting content from the document, only strings, arrays, and objects are considered. All\nother JSON types are ignored. When the JsonPath expression matches an array, each element is\nrecursively explored. When the expression matches an object, keys are left unchanged and each value\nof the object is recursively explored.\n\n\nJSON_PATH Matching Example\n\n\n{\n \"key1\": [\"a\", \"b\", \"c\"],\n \"key2\": {\n \"key3\": [\n {\n \"key4\": [\"d\", \"e\"],\n \"key5\": [\"f\", \"g\"],\n \"key6\": 6\n }\n ]\n }\n}\n\n\n\n\n\n\n\n\n\nExpression\n\n\nMatches\n\n\n\n\n\n\n\n\n\n\n$\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.*\n\n\na, b, c, d, e, f, g\n\n\n\n\n\n\n$.key1\n\n\na, b, c\n\n\n\n\n\n\n$.key1[0]\n\n\na\n\n\n\n\n\n\n$.key2\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3\n\n\nd, e, f, g\n\n\n\n\n\n\n$.key2.key3.*.key4\n\n\nd, e\n\n\n\n\n\n\n$.key2.key3.*.*[0]\n\n\nd, f\n\n\n\n\n\n\n\n\nMedia Selectors Output File\n\n\nWhen media selectors are used, the JsonOutputObject will contain a URI referencing the file\nlocation in the \n$.media.*.mediaSelectorsOutputUri\n field.\n\n\nThe job from the \nNew Job Request Fields section\n could be used with the\ndocument below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s?\"\n },\n {\n 
\"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f\"\n }\n ]\n}\n\n\n\nThe \nmediaSelectorsOutputUri\n field will refer to a document containing the content below.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"Where are you?\"\n }\n ]\n}\n\n\n\nIf \nMEDIA_SELECTORS_DELIMETER\n was set to \" | Translation: \", the file would contain the content\nbelow.\n\n\n{\n \"otherStuffKey\": [\"other stuff value\"],\n \"spanishMessages\": [\n {\n \"to\": \"spanish recipient 1\",\n \"from\": \"spanish sender 1\",\n \"content\": \"\u00bfHola, c\u00f3mo est\u00e1s? | Translation: Hello, how are you?\"\n },\n {\n \"to\": \"spanish recipient 2\",\n \"from\": \"spanish sender 2\",\n \"content\": \"\u00bfD\u00f3nde est\u00e1 la biblioteca? 
| Translation: Where is the library?\"\n }\n ],\n \"chineseMessages\": [\n {\n \"to\": \"chinese recipient 1\",\n \"from\": \"chinese sender 1\",\n \"content\": \"\u73b0\u5728\u662f\u51e0\u594c\uff1f | Translation: What time is it?\"\n },\n {\n \"to\": \"chinese recipient 2\",\n \"from\": \"chinese sender 2\",\n \"content\": \"\u4f60\u53eb\u4ec0\u4e48\u540d\u5b57\uff1f | Translation: What is your name?\"\n },\n {\n \"to\": \"chinese recipient 3\",\n \"from\": \"chinese sender 3\",\n \"content\": \"\u4f60\u5728\u54ea\u91cc\uff1f | Translation: Where are you?\"\n }\n ]\n}",
"title": "Media Selectors Guide"
},
{
@@ -642,7 +642,7 @@
},
{
"location": "/Media-Selectors-Guide/index.html#new-job-request-fields",
- "text": "Below is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector \n that is used in the expression field. $.media.*.mediaSelectors.*.expression : A case-sensitive string specifying the sections of the\n document that should be processed. The type field specifies the syntax of the expression. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. 
$.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector. For example, when performing Argos translation\n on a JSON file in a single-stage pipeline without an upstream language detection stage, this\n could set DEFAULT_SOURCE_LANGUAGE=es for some media selectors and\n DEFAULT_SOURCE_LANGUAGE=zh for others.",
+ "text": "Below is an example of a job that uses media selectors. The job uses a two-stage pipeline.\nThe first stage performs language identification. The second performs translation. {\n \"algorithmProperties\": {},\n \"buildOutput\": true,\n \"jobProperties\": {},\n \"media\": [\n {\n \"mediaUri\": \"file:///opt/mpf/share/remote-media/test-json-path-translation.json\",\n \"properties\": {},\n \"mediaSelectorsOutputAction\": \"ARGOS TRANSLATION (WITH FF REGION AND NO TASK MERGING) ACTION\",\n \"mediaSelectors\": [\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.spanishMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n },\n {\n \"type\": \"JSON_PATH\",\n \"expression\": \"$.chineseMessages.*.content\",\n \"resultDetectionProperty\": \"TRANSLATION\",\n \"selectionProperties\": {}\n }\n ]\n }\n ],\n \"pipelineName\": \"ARGOS TRANSLATION (WITH FASTTEXT LANGUAGE ID) TEXT FILE PIPELINE\",\n \"priority\": 4\n} $.media.*.mediaSelectorsOutputAction : Name of the action that produces content for the media\n selectors output file. In the above example, we specify that we want the translated content\n from Argos in the media selectors output file rather than the detected language from the first\n stage. $.media.*.mediaSelectors : List of media selectors that will be used for the media. $.media.*.mediaSelectors.*.type : The name of the type of media selector \n that is used in the expression field. $.media.*.mediaSelectors.*.expression : A string specifying the sections of the document that\n should be processed. The type field specifies the syntax of the expression. $.media.*.mediaSelectors.*.resultDetectionProperty : A detection property name from tracks\n produced by the mediaSelectorsOutputAction . The media selectors output document will be\n populated with the content of the specified property. 
$.media.*.mediaSelectors.*.selectionProperties : Job properties that will only be used for\n sub-jobs created for a specific media selector. For example, when performing Argos translation\n on a JSON file in a single-stage pipeline without an upstream language detection stage, this\n could set DEFAULT_SOURCE_LANGUAGE=es for some media selectors and\n DEFAULT_SOURCE_LANGUAGE=zh for others.",
"title": "New Job Request Fields"
},
{
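Reviewer note: the `mediaSelectors` fields described in the hunk above can be illustrated with a minimal Python sketch. This is not OpenMPF's implementation. It only mimics how an expression of the form `$.<key>.*.content` picks out strings, and how the media selectors output document differs when `MEDIA_SELECTORS_DELIMETER` is set. The helper names `select_contents` and `apply_results` are hypothetical, and the hard-coded translation stands in for the `TRANSLATION` detection property that a real job would produce.

```python
import copy
import json


def select_contents(document, list_key):
    """Mimic `$.<list_key>.*.content`: return (index, text) for each message."""
    return [(i, msg["content"]) for i, msg in enumerate(document[list_key])]


def apply_results(document, list_key, results, delimiter=None):
    """Return a copy of `document` with the selected content rewritten.

    Without a delimiter, each selected string is replaced by its result.
    With a delimiter, the original string is kept and the result is
    appended after the delimiter.
    """
    output = copy.deepcopy(document)  # leave the input document untouched
    selected = select_contents(document, list_key)
    for (index, original), result in zip(selected, results):
        if delimiter is None:
            output[list_key][index]["content"] = result
        else:
            output[list_key][index]["content"] = original + delimiter + result
    return output


# Input trimmed from the guide's example document.
document = {
    "otherStuffKey": ["other stuff value"],
    "spanishMessages": [
        {
            "to": "spanish recipient 2",
            "from": "spanish sender 2",
            "content": "¿Dónde está la biblioteca?",
        }
    ],
}

# Stand-in for the TRANSLATION property of the output action's tracks.
translations = ["Where is the library?"]

replaced = apply_results(document, "spanishMessages", translations)
joined = apply_results(
    document, "spanishMessages", translations, delimiter=" | Translation: "
)

print(json.dumps(replaced, ensure_ascii=False, indent=2))
print(json.dumps(joined, ensure_ascii=False, indent=2))
```

Content that no selector matches (here `otherStuffKey`) is copied through unchanged in both cases, which matches the example documents in the guide.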
diff --git a/docs/site/sitemap.xml b/docs/site/sitemap.xml
index 8872edcefcd6..1dbf1c1771d8 100644
--- a/docs/site/sitemap.xml
+++ b/docs/site/sitemap.xml
@@ -2,157 +2,157 @@
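<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Release-Notes/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/License-And-Distribution/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Acknowledgements/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Install-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Admin-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/User-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/OpenID-Connect-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Media-Segmentation-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Feed-Forward-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Derivative-Media-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Object-Storage-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Markup-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/TiesDb-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Trigger-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Roll-Up-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Health-Check-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Quality-Selection-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Media-Selectors-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/REST-API/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Component-API-Overview/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Component-Descriptor-Reference/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/CPP-Batch-Component-API/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Python-Batch-Component-API/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Java-Batch-Component-API/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/GPU-Support-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Contributor-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Development-Environment-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Node-Guide/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/Workflow-Manager-Architecture/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
<url>
<loc>/CPP-Streaming-Component-API/index.html</loc>
- <lastmod>2025-02-04</lastmod>
+ <lastmod>2025-02-05</lastmod>
<changefreq>daily</changefreq>
</url>
</urlset>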
\ No newline at end of file
From 12e8cd99649863977c56314d9ff644702559a576 Mon Sep 17 00:00:00 2001
From: Brian Rosenberg
Date: Fri, 14 Feb 2025 07:23:46 -0500
Subject: [PATCH 7/7] Add fastText
---
docs/docs/index.md | 1 +
docs/site/index.html | 7 ++++++-
docs/site/search/search_index.json | 4 ++--
3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/docs/docs/index.md b/docs/docs/index.md
index aad4dea55e5f..1afdbab03c17 100644
--- a/docs/docs/index.md
+++ b/docs/docs/index.md
@@ -31,5 +31,6 @@ A list of algorithms currently integrated into the OpenMPF as distributed proces
| Detection | Keywords | Boost Regular Expressions
| Detection | Image (from document) | Apache Tika
| Translation | Language | Azure Cognitive Services Translate API
+| Detection | Language | fastText with the GlotLID model
The OpenMPF exposes data processing and job management web services via a User Interface (UI). These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.
diff --git a/docs/site/index.html b/docs/site/index.html
index b43069557ef5..beb07abe30d9 100644
--- a/docs/site/index.html
+++ b/docs/site/index.html
@@ -344,6 +344,11 @@
Overview
<td>Language</td>
<td>Azure Cognitive Services Translate API</td>
</tr>
+<tr>
+<td>Detection</td>
+<td>Language</td>
+<td>fastText with the GlotLID model</td>
+</tr>
</tbody>
</table>
The OpenMPF exposes data processing and job management web services via a User Interface (UI). These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.
@@ -399,5 +404,5 @@
Overview
diff --git a/docs/site/search/search_index.json b/docs/site/search/search_index.json
index 856953cbc0cb..363657a36e3f 100644
--- a/docs/site/search/search_index.json
+++ b/docs/site/search/search_index.json
@@ -2,12 +2,12 @@
"docs": [
{
"location": "/index.html",
- "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract, and is subject to the\nRights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2024 The MITRE Corporation. All Rights Reserved.\n\n\nOverview\n\n\nThere are numerous video and image exploitation capabilities available today. The Open Media Processing Framework (OpenMPF) provides a framework for chaining, combining, or replacing individual components for the purpose of experimentation and comparison.\n\n\nOpenMPF is a non-proprietary, scalable framework that permits practitioners and researchers to construct video, imagery, and audio exploitation capabilities using the available third-party components. Using OpenMPF, one can extract targeted entities in large-scale data environments, such as face and object detection.\n\n\nFor those developing new exploitation capabilities, OpenMPF exposes a set of Application Program Interfaces (APIs) for extending media analytics functionality. The APIs allow integrators to introduce new algorithms capable of detecting new targeted entity types. For example, a backpack detection algorithm could be integrated into an OpenMPF instance. 
OpenMPF does not restrict the number of algorithms that can operate on a given media file, permitting researchers, practitioners, and developers to explore arbitrarily complex composites of exploitation algorithms.\n\n\nA list of algorithms currently integrated into the OpenMPF as distributed processing components is shown here:\n\n\n\n\n\n\n\n\nOperation\n\n\nObject Type\n\n\nFramework\n\n\n\n\n\n\n\n\n\n\nDetection/Tracking\n\n\nFace\n\n\nLBP-Based OpenCV\n\n\n\n\n\n\nDetection/Tracking\n\n\nMotion\n\n\nMOG w/ STRUCK\n\n\n\n\n\n\nDetection/Tracking\n\n\nMotion\n\n\nSuBSENSE w/ STRUCK\n\n\n\n\n\n\nDetection/Tracking\n\n\nLicense Plate\n\n\nOpenALPR\n\n\n\n\n\n\nDetection\n\n\nSpeech\n\n\nSphinx\n\n\n\n\n\n\nDetection\n\n\nSpeech\n\n\nAzure Cognitive Services Batch Transcription API\n\n\n\n\n\n\nDetection\n\n\nScene\n\n\nOpenCV\n\n\n\n\n\n\nDetection\n\n\nClassification\n\n\nOpenCV DNN (GoogLeNet, Yahoo NSFW, vehicle color)\n\n\n\n\n\n\nDetection/Tracking\n\n\nClassification\n\n\nOpenCV DNN (YOLO)\n\n\n\n\n\n\nDetection/Tracking\n\n\nClassification/Features\n\n\nTensorRT (COCO classes)\n\n\n\n\n\n\nDetection\n\n\nText Region\n\n\nEAST\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nApache Tika\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nTesseract OCR\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nAzure Cognitive Services Read API\n\n\n\n\n\n\nDetection\n\n\nForm Structure (with OCR)\n\n\nAzure Cognitive Services Form Recognizer API\n\n\n\n\n\n\nDetection\n\n\nKeywords\n\n\nBoost Regular Expressions\n\n\n\n\n\n\nDetection\n\n\nImage (from document)\n\n\nApache Tika\n\n\n\n\n\n\nTranslation\n\n\nLanguage\n\n\nAzure Cognitive Services Translate API\n\n\n\n\n\n\n\n\nThe OpenMPF exposes data processing and job management web services via a User Interface (UI). These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. 
The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.",
+ "text": "NOTICE:\n This software (or technical data) was produced for the U.S. Government under contract, and is subject to the\nRights in Data-General Clause 52.227-14, Alt. IV (DEC 2007). Copyright 2024 The MITRE Corporation. All Rights Reserved.\n\n\nOverview\n\n\nThere are numerous video and image exploitation capabilities available today. The Open Media Processing Framework (OpenMPF) provides a framework for chaining, combining, or replacing individual components for the purpose of experimentation and comparison.\n\n\nOpenMPF is a non-proprietary, scalable framework that permits practitioners and researchers to construct video, imagery, and audio exploitation capabilities using the available third-party components. Using OpenMPF, one can extract targeted entities in large-scale data environments, such as face and object detection.\n\n\nFor those developing new exploitation capabilities, OpenMPF exposes a set of Application Program Interfaces (APIs) for extending media analytics functionality. The APIs allow integrators to introduce new algorithms capable of detecting new targeted entity types. For example, a backpack detection algorithm could be integrated into an OpenMPF instance. 
OpenMPF does not restrict the number of algorithms that can operate on a given media file, permitting researchers, practitioners, and developers to explore arbitrarily complex composites of exploitation algorithms.\n\n\nA list of algorithms currently integrated into the OpenMPF as distributed processing components is shown here:\n\n\n\n\n\n\n\n\nOperation\n\n\nObject Type\n\n\nFramework\n\n\n\n\n\n\n\n\n\n\nDetection/Tracking\n\n\nFace\n\n\nLBP-Based OpenCV\n\n\n\n\n\n\nDetection/Tracking\n\n\nMotion\n\n\nMOG w/ STRUCK\n\n\n\n\n\n\nDetection/Tracking\n\n\nMotion\n\n\nSuBSENSE w/ STRUCK\n\n\n\n\n\n\nDetection/Tracking\n\n\nLicense Plate\n\n\nOpenALPR\n\n\n\n\n\n\nDetection\n\n\nSpeech\n\n\nSphinx\n\n\n\n\n\n\nDetection\n\n\nSpeech\n\n\nAzure Cognitive Services Batch Transcription API\n\n\n\n\n\n\nDetection\n\n\nScene\n\n\nOpenCV\n\n\n\n\n\n\nDetection\n\n\nClassification\n\n\nOpenCV DNN (GoogLeNet, Yahoo NSFW, vehicle color)\n\n\n\n\n\n\nDetection/Tracking\n\n\nClassification\n\n\nOpenCV DNN (YOLO)\n\n\n\n\n\n\nDetection/Tracking\n\n\nClassification/Features\n\n\nTensorRT (COCO classes)\n\n\n\n\n\n\nDetection\n\n\nText Region\n\n\nEAST\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nApache Tika\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nTesseract OCR\n\n\n\n\n\n\nDetection\n\n\nText (OCR)\n\n\nAzure Cognitive Services Read API\n\n\n\n\n\n\nDetection\n\n\nForm Structure (with OCR)\n\n\nAzure Cognitive Services Form Recognizer API\n\n\n\n\n\n\nDetection\n\n\nKeywords\n\n\nBoost Regular Expressions\n\n\n\n\n\n\nDetection\n\n\nImage (from document)\n\n\nApache Tika\n\n\n\n\n\n\nTranslation\n\n\nLanguage\n\n\nAzure Cognitive Services Translate API\n\n\n\n\n\n\nDetection\n\n\nLanguage\n\n\nfastText with the GlotLID model\n\n\n\n\n\n\n\n\nThe OpenMPF exposes data processing and job management web services via a User Interface (UI). 
These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.",
"title": "Home"
},
{
"location": "/index.html#overview",
- "text": "There are numerous video and image exploitation capabilities available today. The Open Media Processing Framework (OpenMPF) provides a framework for chaining, combining, or replacing individual components for the purpose of experimentation and comparison. OpenMPF is a non-proprietary, scalable framework that permits practitioners and researchers to construct video, imagery, and audio exploitation capabilities using the available third-party components. Using OpenMPF, one can extract targeted entities in large-scale data environments, such as face and object detection. For those developing new exploitation capabilities, OpenMPF exposes a set of Application Program Interfaces (APIs) for extending media analytics functionality. The APIs allow integrators to introduce new algorithms capable of detecting new targeted entity types. For example, a backpack detection algorithm could be integrated into an OpenMPF instance. OpenMPF does not restrict the number of algorithms that can operate on a given media file, permitting researchers, practitioners, and developers to explore arbitrarily complex composites of exploitation algorithms. 
A list of algorithms currently integrated into the OpenMPF as distributed processing components is shown here: Operation Object Type Framework Detection/Tracking Face LBP-Based OpenCV Detection/Tracking Motion MOG w/ STRUCK Detection/Tracking Motion SuBSENSE w/ STRUCK Detection/Tracking License Plate OpenALPR Detection Speech Sphinx Detection Speech Azure Cognitive Services Batch Transcription API Detection Scene OpenCV Detection Classification OpenCV DNN (GoogLeNet, Yahoo NSFW, vehicle color) Detection/Tracking Classification OpenCV DNN (YOLO) Detection/Tracking Classification/Features TensorRT (COCO classes) Detection Text Region EAST Detection Text (OCR) Apache Tika Detection Text (OCR) Tesseract OCR Detection Text (OCR) Azure Cognitive Services Read API Detection Form Structure (with OCR) Azure Cognitive Services Form Recognizer API Detection Keywords Boost Regular Expressions Detection Image (from document) Apache Tika Translation Language Azure Cognitive Services Translate API The OpenMPF exposes data processing and job management web services via a User Interface (UI). These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.",
+ "text": "There are numerous video and image exploitation capabilities available today. The Open Media Processing Framework (OpenMPF) provides a framework for chaining, combining, or replacing individual components for the purpose of experimentation and comparison. OpenMPF is a non-proprietary, scalable framework that permits practitioners and researchers to construct video, imagery, and audio exploitation capabilities using the available third-party components. Using OpenMPF, one can extract targeted entities in large-scale data environments, such as face and object detection. For those developing new exploitation capabilities, OpenMPF exposes a set of Application Program Interfaces (APIs) for extending media analytics functionality. The APIs allow integrators to introduce new algorithms capable of detecting new targeted entity types. For example, a backpack detection algorithm could be integrated into an OpenMPF instance. OpenMPF does not restrict the number of algorithms that can operate on a given media file, permitting researchers, practitioners, and developers to explore arbitrarily complex composites of exploitation algorithms. 
A list of algorithms currently integrated into the OpenMPF as distributed processing components is shown here: Operation Object Type Framework Detection/Tracking Face LBP-Based OpenCV Detection/Tracking Motion MOG w/ STRUCK Detection/Tracking Motion SuBSENSE w/ STRUCK Detection/Tracking License Plate OpenALPR Detection Speech Sphinx Detection Speech Azure Cognitive Services Batch Transcription API Detection Scene OpenCV Detection Classification OpenCV DNN (GoogLeNet, Yahoo NSFW, vehicle color) Detection/Tracking Classification OpenCV DNN (YOLO) Detection/Tracking Classification/Features TensorRT (COCO classes) Detection Text Region EAST Detection Text (OCR) Apache Tika Detection Text (OCR) Tesseract OCR Detection Text (OCR) Azure Cognitive Services Read API Detection Form Structure (with OCR) Azure Cognitive Services Form Recognizer API Detection Keywords Boost Regular Expressions Detection Image (from document) Apache Tika Translation Language Azure Cognitive Services Translate API Detection Language fastText with the GlotLID model The OpenMPF exposes data processing and job management web services via a User Interface (UI). These services allow users to upload media, create media processing jobs, determine the status of jobs, and retrieve the artifacts associated with completed jobs. The web services give application developers flexibility to use the OpenMPF in their preferred environment and programming language.",
"title": "Overview"
},
{