Use OneNote API div tags to extract data from captures

Applies to: Consumer notebooks on OneDrive | Enterprise notebooks on Office 365

Use the OneNote API to extract business card data from an image, or recipe and product data from a URL.

Extraction attributes

To extract and transform data, include a div that specifies the source content, extraction method, and fallback behavior in your create-page or update-page request. The API renders the extracted data on the page in an easy-to-read format.

<div
  data-render-src="image-or-url"
  data-render-method="extraction-method"
  data-render-fallback="fallback-action">
</div>

data-render-src

The content source. This can be an image of a business card or an absolute URL from many popular recipe or product websites. Required.

For best results when specifying a URL, use the canonical URL defined in the HTML of the source webpage, if one is defined. For example, a canonical URL might be defined in the source webpage like this:

<link rel="canonical" href="https://www.domainname.com/page/123/size12/type987" />
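
If your app already has the HTML of the source webpage, it can look up the canonical URL before building the request. The following Python sketch shows one way to do that with the standard library. The names CanonicalLinkFinder and find_canonical_url are illustrative and are not part of the OneNote API.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalLinkFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first canonical link is kept; later ones are ignored.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonical = attr_map["href"]


def find_canonical_url(page_html: str, fallback_url: str) -> str:
    """Return the page's canonical URL, or fallback_url if none is declared."""
    finder = CanonicalLinkFinder()
    finder.feed(page_html)
    return finder.canonical or fallback_url
```

The returned value can then be used as the data-render-src value. If the page declares no canonical link, the sketch falls back to the URL you already have.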

data-render-method

The extraction method to run. Required.

Value                   Description
extract.businesscard    A business card extraction.
extract.recipe          A recipe extraction.
extract.product         A product listing extraction.
extract                 An unknown extraction type.

For best results, specify the content type (extract.businesscard, extract.recipe, or extract.product) if you know it. If the type is unknown, use the extract method, and the OneNote API will try to auto-detect the type.

data-render-fallback

The fallback behavior if the extraction fails. Defaults to render if omitted.

Value     Description
render    Renders the source image or a snapshot of the recipe or product webpage.
none      Does nothing.

The none option is useful if you want to always include a snapshot of the business card or webpage on the page in addition to any extracted content. Be sure to send a separate img element in the request, as shown in the examples.
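
The three attributes can also be assembled in code. The following Python sketch is one possible helper, not an official client. The function name extraction_div and the METHODS and FALLBACKS sets are illustrative assumptions built from the values listed in this section.

```python
from html import escape

# Allowed values, copied from the attribute descriptions in this section.
METHODS = {"extract", "extract.businesscard", "extract.recipe", "extract.product"}
FALLBACKS = {"render", "none"}


def extraction_div(src: str, method: str = "extract", fallback: str = "render") -> str:
    """Return a div that asks the OneNote API to extract data from src."""
    if not src:
        raise ValueError("data-render-src is required")
    if method not in METHODS:
        raise ValueError(f"unknown extraction method: {method!r}")
    if fallback not in FALLBACKS:
        raise ValueError(f"unknown fallback: {fallback!r}")
    # Escape the source so characters such as & or " in a URL stay valid HTML.
    return (
        f'<div data-render-src="{escape(src, quote=True)}" '
        f'data-render-method="{method}" '
        f'data-render-fallback="{fallback}"></div>'
    )
```

Validating the values before sending the request surfaces a misspelled method or fallback in your own code instead of in the API response.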

Business card extractions

The OneNote API tries to find and render the following contact information based on an image of a person's or company's business card.

  • Name
  • Title
  • Organization
  • Phone and fax numbers
  • Mailing and physical addresses
  • Email addresses
  • Websites

[Image: An example business card extraction]

A vCard (.VCF file) with the extracted contact information is also embedded in the page. The vCard is a convenient way to get the contact information when retrieving page HTML content.

Common scenarios for business card extractions

Extract business card information, and also render the business card image

Specify the extract.businesscard method and the none fallback. Also send an img element whose src attribute references the same image. If the API is unable to extract any content, it renders the business card image only.

<div
    data-render-src="name:scanned-card-image"
    data-render-method="extract.businesscard"
    data-render-fallback="none">
</div>
<img src="name:scanned-card-image" />

Extract business card information, and render the business card image only if the extraction fails

Specify the extract.businesscard method and use the default render fallback. If the API is unable to extract any content, it renders the business card image instead.

<div
    data-render-src="name:scanned-card-image"
    data-render-method="extract.businesscard">
</div>

For business card extractions, the image is sent as a named part in a multipart request. See Add images and files for examples that show how to send an image in a request.
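
As a rough illustration of a named part, the following Python sketch builds a multipart body by hand. It is a sketch under stated assumptions, not a definitive implementation: the "Presentation" part name for the page HTML, the image/jpeg content type, and the function name build_card_request are assumptions here, so check them against the Add images and files topic before relying on them.

```python
import uuid


def build_card_request(image_bytes: bytes, part_name: str = "scanned-card-image"):
    """Return (content_type_header, body) for a multipart create-page request."""
    boundary = f"Part_{uuid.uuid4().hex}"
    # The div refers to the image part by name, using the name: prefix.
    page_html = (
        "<!DOCTYPE html><html><head><title>Business card</title></head><body>"
        f'<div data-render-src="name:{part_name}" '
        'data-render-method="extract.businesscard"></div>'
        "</body></html>"
    )
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        # Assumed part name for the page HTML.
        b'Content-Disposition: form-data; name="Presentation"',
        b"Content-Type: text/html",
        b"",
        page_html.encode("utf-8"),
        f"--{boundary}".encode(),
        # The part name must match the name used in data-render-src.
        f'Content-Disposition: form-data; name="{part_name}"'.encode(),
        b"Content-Type: image/jpeg",  # assumed image type
        b"",
        image_bytes,
        f"--{boundary}--".encode(),
        b"",
    ])
    return f"multipart/form-data; boundary={boundary}", body
```

The first return value would be sent as the Content-Type header of the request and the second as its body. Sending the request and handling authentication are out of scope for this sketch.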

Recipe extractions

The OneNote API tries to find and render the following information based on a recipe's URL.

  • Hero image
  • Rating
  • Ingredients
  • Instructions
  • Prep, cook, and total times
  • Servings

[Image: An example recipe extraction]

The API is optimized for recipes from many popular sites such as Allrecipes.com, FoodNetwork.com, and SeriousEats.com.

Common scenarios for recipe extractions

Extract recipe information, and also render a snapshot of the recipe webpage

Specify the extract.recipe method and the none fallback. Also send an img element with the data-render-src attribute set to the recipe URL. If the API is unable to extract any content, it renders a snapshot of the recipe webpage only.

This scenario potentially provides the most information because the webpage may include additional information, such as customer reviews and suggestions.

<div
    data-render-src="https://allrecipes.com/recipe/guacamole/"
    data-render-method="extract.recipe"
    data-render-fallback="none">
</div>
<img data-render-src="https://allrecipes.com/recipe/guacamole/" />

Extract recipe information, and render a snapshot of the recipe webpage only if the extraction fails

Specify the extract.recipe method and use the default render fallback. If the API is unable to extract any content, it renders a snapshot of the recipe webpage instead.

<div
    data-render-src="https://www.foodnetwork.com/recipes/alton-brown/creme-brulee-recipe.html"
    data-render-method="extract.recipe">
</div>

Extract recipe information, and render a link to the recipe if the extraction fails

Specify the extract.recipe method and the none fallback. Also send an a element with the href attribute set to the recipe URL (or send any other information you want to add to the page). If the API is unable to extract any content, only the recipe link is rendered.

<div
    data-render-src="https://www.seriouseats.com/recipes/2014/09/diy-spicy-kimchi-beef-instant-noodles-recipe.html"
    data-render-method="extract.recipe"
    data-render-fallback="none">
</div>
<a href="https://www.seriouseats.com/recipes/2014/09/diy-spicy-kimchi-beef-instant-noodles-recipe.html">Recipe URL</a>

Product listing extractions

The OneNote API tries to find and render the following information based on a product's URL.

  • Title
  • Rating
  • Primary image
  • Description
  • Features
  • Specifications

[Image: An example product listing extraction]

The API is optimized for products from many popular sites such as Amazon.com and HomeDepot.com.

Common scenarios for product listing extractions

Extract product information, and also render a snapshot of the product webpage

Specify the extract.product method and the none fallback. Also send an img element with the data-render-src attribute set to the product URL. If the API is unable to extract any content, it renders a snapshot of the product webpage only.

This scenario potentially provides the most information because the webpage may include additional information, such as customer reviews and suggestions.

<div
    data-render-src="https://www.amazon.com/Microsoft-Band-Small/dp/B00P2T2WVO"
    data-render-method="extract.product"
    data-render-fallback="none">
</div>
<img data-render-src="https://www.amazon.com/Microsoft-Band-Small/dp/B00P2T2WVO" />

Extract product information, and render a snapshot of the product webpage only if the extraction fails

Specify the extract.product method and use the default render fallback. If the API is unable to extract any content, it renders a snapshot of the product webpage instead.

<div
    data-render-src="https://www.sears.com/craftsman-19hp-42-8221-turn-tight-174-hydrostatic-yard-tractor/p-07120381000P"
    data-render-method="extract.product">
</div>

Extract product information, and render a link to the product if the extraction fails

Specify the extract.product method and the none fallback. Also send an a element with the href attribute set to the product URL (or send any other information you want to add to the page). If the API is unable to extract any content, only the page link is rendered.

<div
    data-render-src="https://www.homedepot.com/p/Active-Ventilation-5-Watt-Solar-Powered-Exhaust-Attic-Fan-RBSF-8-WT/204203001"
    data-render-method="extract.product"
    data-render-fallback="none">
</div>
<a href="https://www.homedepot.com/p/Active-Ventilation-5-Watt-Solar-Powered-Exhaust-Attic-Fan-RBSF-8-WT/204203001">Product URL</a>

Unknown content type extractions

If you don't know the content type (business card, recipe, or product) that you're sending, you can use the unqualified extract method and let the OneNote API automatically detect the type. You might want to do this if your app sends different capture types.

Note: If you do know the content type that you're sending, you should use the extract.businesscard, extract.recipe, or extract.product method. In some cases, this can help to optimize the extraction results.

Common scenarios for unknown extractions

Send an image or a URL, and render the supplied image or a snapshot of the webpage if the extraction fails

Specify the extract method so the API automatically detects the content type, and use the default render fallback. If the API is unable to extract any content, it renders the supplied image or a snapshot of the webpage instead.

<div
    data-render-src="some image or url"
    data-render-method="extract">
</div>
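
An app that sends several capture types can choose the method at run time. The following Python sketch picks the specific method when the app knows the type and falls back to the unqualified extract method otherwise. The capture labels ("businesscard", "recipe", "product") and the name method_for are hypothetical app-side conventions, not OneNote API values.

```python
from typing import Optional

# Hypothetical app-side labels mapped to the documented extraction methods.
KNOWN_METHODS = {
    "businesscard": "extract.businesscard",
    "recipe": "extract.recipe",
    "product": "extract.product",
}


def method_for(capture_kind: Optional[str]) -> str:
    """Pick the most specific method, or plain extract when the kind is unknown."""
    return KNOWN_METHODS.get(capture_kind or "", "extract")
```

The result is the value to place in the data-render-method attribute of the div.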

Response information

Response data    Description
Success code     A 201 HTTP status code for a successful POST request, and a 204 HTTP status code for a successful PATCH request.
Errors           Read Error codes for OneNote APIs in Microsoft Graph to learn about OneNote errors that Microsoft Graph can return.
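
A client can compare the returned status code with the documented success code for the request type. The following Python sketch shows one way to express that check. The names EXPECTED_STATUS and request_succeeded are illustrative.

```python
# Documented success codes: 201 for POST (create page), 204 for PATCH (update page).
EXPECTED_STATUS = {"POST": 201, "PATCH": 204}


def request_succeeded(http_method: str, status_code: int) -> bool:
    """True when status_code is the documented success code for http_method."""
    expected = EXPECTED_STATUS.get(http_method.upper())
    if expected is None:
        raise ValueError(f"unsupported method: {http_method!r}")
    return status_code == expected
```

Any other status code should be looked up in the error code reference mentioned above.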

Permissions

To create or update OneNote pages, you'll need to request appropriate permissions. Choose the lowest level of permissions that your app needs to do its work.

Permissions for POST pages

  • Notes.Create
  • Notes.ReadWrite
  • Notes.ReadWrite.All

Permissions for PATCH pages

  • Notes.ReadWrite
  • Notes.ReadWrite.All

For more information about permission scopes and how they work, see Microsoft Graph permissions reference.

See also