webmachinelearning/proofreader-api


Proofreader API Explainer

This proposal is an early design sketch by the ODML and Chrome built-in AI teams to describe the problem below and solicit feedback on the proposed solution. It has not been approved to ship in Chrome.

Proofreading is the process of examining a text carefully to find and correct errors in grammar, spelling, and punctuation, so that the text is error-free before it is published or shared. Browsers and operating systems increasingly offer proofreading capabilities to help their users compose text.

Web applications can also benefit from such proofreading capability. This proposal introduces a new JavaScript API which, by exposing high-level functionality of a language model, corrects and labels a variety of errors from user input. Specifically, the proposed proofreading API in this explainer exposes three specific higher-level functionalities for proofreading:

  1. Error Correction: Correct the text input by the user
  2. Error Labeling: For each correction made in the input text, label the error type (e.g. spelling, punctuation, etc.)
  3. Error Explanation: Annotate each error with a plain-language explanation

Note that Labeling & Explanation are independent features that can be either added or dropped.

Goals

Our goals are to:

  • Help web developers perform real-time proofreading (e.g. of user input) on short phrases/sentences/paragraphs of freeform text.
  • Allow web developers to build flexible proofreading UI/UX.
  • Offer higher-level APIs with specific inputs and output formats that can support error labeling and explanations, abstracting away the underlying implementation (e.g. OS feature, language model, etc.).
  • Enable progressive enhancement, so web developers can gracefully handle varying levels of user agent support.

The following are explicit non-goals:

  • Proofreading markdown or other formats/syntaxes (e.g. not intended for JS code)
  • Checking for consistent style and formatting throughout user-provided input

Use cases

  • Proofread and suggest corrections to user messages in chat applications
  • Proofread and help polish email drafting
  • Catch errors and provide corrections during note-taking
  • Proofread a comment to a forum/article/blog
  • Provide high quality interactive proofreading along with labeling & explanations for the correction when writing documents

Examples

Basic usage

Create a proofreader object customized as necessary, and call its method to proofread an input:

const proofreader = await Proofreader.create({
  includeCorrectionTypes: true,
  includeCorrectionExplanations: true,
});

const corrections = await proofreader.proofread("I seen him yesterday at the store, and he bought two loafs of bread.");

proofread() corrects the input text and returns the fully corrected text together with a list of the corrections made. Additional proofreading features can be configured using includeCorrectionTypes and includeCorrectionExplanations. When includeCorrectionTypes is set to true, each correction carries an error type label. When includeCorrectionExplanations is set to true, each correction carries a plain-language explanation.

Detailed design for the corrections output is discussed later.
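
Because user agent support will vary, a page may want to check for the API before using it. The following is a minimal sketch of such a check, assuming the interface is exposed as a Proofreader property on the global scope (as the Web IDL later in this explainer suggests). The helper names isProofreaderSupported and proofreadIfSupported are hypothetical and not part of the proposal.

```javascript
// Hypothetical helper: true when the given global scope exposes `Proofreader`.
// `scope` defaults to globalThis; it is a parameter only so the check can be
// exercised outside a browser.
function isProofreaderSupported(scope = globalThis) {
  return "Proofreader" in scope;
}

// Hypothetical helper: proofread when the API exists, otherwise resolve to null
// so the caller can fall back to another code path.
async function proofreadIfSupported(text, scope = globalThis) {
  if (!isProofreaderSupported(scope)) {
    return null;
  }
  const proofreader = await scope.Proofreader.create();
  try {
    return await proofreader.proofread(text);
  } finally {
    // Let the user agent release the model once this one-off call is done.
    proofreader.destroy();
  }
}
```

A null result here only means the entry point is missing; create() can still reject for other reasons, which the caller would need to handle separately.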

Repeated usage

A created proofreader object can be used multiple times. The only shared state is the initial configuration options; the inputs do not build on each other.

const proofreader = await Proofreader.create();

editBoxEl.addEventListener("blur", async (event) => {
  const corrections = await proofreader.proofread(event.target.value);
});

Expected input languages

The default behavior for the proofreader object assumes that the input language is unknown. In this case, implementations will use whatever "base" capabilities they have available for these operations, and might throw "NotSupportedError" DOMExceptions if they encounter languages they don't support.

It’s better practice, if possible, to supply the create() method with information about the expected languages in use. This allows the implementation to download any necessary supporting material, such as fine-tunings or safety-checking models, and to immediately reject the promise returned by create() if the web developer wants to use languages that the browser is not capable of supporting:

const proofreader = await Proofreader.create({
  includeCorrectionTypes: true,
  expectedInputLanguages: ["en"],
});

Expected explanation language

When explanations are requested, the proofreader by default assumes that the explanation language is unknown and writes explanations in the same language as the input.

As with input languages, it’s better practice, if possible, to supply the create() method with the expected explanation language.

const proofreader = await Proofreader.create({
  includeCorrectionExplanations: true,
  expectedInputLanguages: ["en"],
  correctionExplanationLanguage: "en",
});

Multilingual content

When the proofreading input contains multiple languages, developers can specify them all in the expectedInputLanguages list passed to the create() method.

const proofreader = await Proofreader.create({
  includeCorrectionTypes: true,
  expectedInputLanguages: ["en", "ja"],
});

Testing available options before creation

The proofreading API is customizable through the options passed to create(), including the language options above. All options are described in more detail in a later section.

However, not every model will support every language, and first use might require a download to get the appropriate fine-tuning or other necessary collateral.

In the simple case, web developers should call create(), and handle failures gracefully. However, if they want to provide a differentiated user experience, which lets users know ahead of time that the feature will not be possible or might require a download, they can use the API’s promise-returning availability() method. This method lets developers know, before calling create(), what is possible with the implementation.

The method returns a promise that fulfills with one of the following availability values:

  • "unavailable" means that the implementation does not support the requested options.
  • "downloadable" means that the implementation supports the requested options, but it will have to download something (e.g. machine learning model or fine-tuning) before it can do anything.
  • "downloading" means that the implementation supports the requested options, but it will have to finish an ongoing download before it can do anything.
  • "available" means that the implementation supports the requested options without requiring any new downloads.

An example usage is the following:

const options = { includeCorrectionTypes: true, expectedInputLanguages: ["en"] };

const supportsOurUseCase = await Proofreader.availability(options);

if (supportsOurUseCase !== "unavailable") {
  // We're good! Let's do the proofreading using the built-in API.
  if (supportsOurUseCase !== "available") {
    console.log("Sit tight, we need to do some downloading...");
  }
  const proofreader = await Proofreader.create(options);
  console.log(await proofreader.proofread(editBoxEl.textContent));
} else {
  // Either the API overall, or the combination of correction-with-labels with
  // English input, is not available.
  // Handle the failure / run alternatives.
}
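
For a more differentiated experience, the four availability values can be mapped to distinct UI states. The sketch below is one possible mapping; the function name describeAvailability, the returned object shape, and the message strings are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: turn an availability value into a small UI descriptor.
// Unknown values are treated like "unavailable" so the page fails closed.
function describeAvailability(availability) {
  switch (availability) {
    case "available":
      return { canProofread: true, needsDownload: false, message: "Proofreading is ready." };
    case "downloadable":
      return { canProofread: true, needsDownload: true, message: "Proofreading needs a download first." };
    case "downloading":
      return { canProofread: true, needsDownload: true, message: "Proofreading is still downloading." };
    case "unavailable":
    default:
      return { canProofread: false, needsDownload: false, message: "Proofreading is not available." };
  }
}

// Possible usage in a page (not executed here):
//   const state = describeAvailability(await Proofreader.availability(options));
//   proofreadButton.disabled = !state.canProofread;
```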

Download progress

For cases where using the API is only possible after a download, you can monitor the download progress (e.g. in order to show your users a progress bar) using code such as the following:

const proofreader = await Proofreader.create({
  ...otherOptions,
  monitor(m) {
    m.addEventListener("downloadprogress", e => {
      console.log(`Downloaded ${e.loaded * 100}%`);
    });
  }
});

If the download fails, then downloadprogress events will stop being fired, and the promise returned by create() will be rejected with a "NetworkError" DOMException.

Note that some implementations might require multiple entities to be downloaded, e.g., a base model plus a LoRA fine-tuning. In such a case, web developers do not get the ability to monitor the individual downloads. All of them are bundled into the overall downloadprogress events, and the create() promise is not fulfilled until all downloads and loads are successful.
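
When feeding these events into a progress bar, it can help to normalize the value first. The sketch below assumes, as the example above does, that e.loaded is a fraction between 0 and 1; the helper name formatDownloadProgress is hypothetical.

```javascript
// Hypothetical helper: convert a downloadprogress `loaded` fraction (assumed to
// be in the range 0..1) into a whole-number percentage string. Out-of-range or
// non-numeric values are clamped so the UI never shows e.g. "120%" or "NaN%".
function formatDownloadProgress(loaded) {
  const fraction = Math.min(1, Math.max(0, Number(loaded) || 0));
  return `${Math.round(fraction * 100)}%`;
}

// Possible usage inside the monitor callback (not executed here):
//   m.addEventListener("downloadprogress", (e) => {
//     progressLabel.textContent = formatDownloadProgress(e.loaded);
//   });
```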

Destruction and aborting

The API comes equipped with a couple of signal options that accept AbortSignals, to allow aborting the creation of the proofreader, or the operations themselves:

const controller = new AbortController();
stopButton.onclick = () => controller.abort();

const proofreader = await Proofreader.create({ signal: controller.signal });
await proofreader.proofread(document.body.textContent, { signal: controller.signal });

Additionally, the proofreader object itself has a destroy() method, which is a convenience method with equivalent behavior for cases where the proofreader object has already been created.

Destroying a proofreader will:

  • Reject any ongoing operations (proofread()).
  • And, most importantly, allow the user agent to unload the machine learning models from memory. (If no other APIs are using them.)

Allowing such destruction provides a way to free up the memory used by the language model without waiting for garbage collection, since models can be quite large.

Aborting the creation process will reject the promise returned by create(), and will also stop signaling any ongoing download progress. (The browser may then abort the downloads, or may continue them. Either way, no further downloadprogress events will be fired.)
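
One way the signal option might be used in an editor is to cancel a stale proofread() call whenever newer input arrives. The following is a rough sketch of that pattern, not a recommendation from the proposal; createLatestOnlyProofread is a hypothetical name, and the only API surface it relies on is proofread(input, { signal }).

```javascript
// Hypothetical wrapper: returns a function that proofreads the given text and
// aborts whichever earlier call made through the same wrapper is still pending.
function createLatestOnlyProofread(proofreader) {
  let currentController = null;
  return function proofreadLatest(text) {
    if (currentController) {
      // Calling abort() on an already-settled operation is harmless.
      currentController.abort();
    }
    currentController = new AbortController();
    return proofreader.proofread(text, { signal: currentController.signal });
  };
}

// Possible usage (not executed here):
//   const proofreadLatest = createLatestOnlyProofread(proofreader);
//   editBoxEl.addEventListener("input", (event) => {
//     proofreadLatest(event.target.value).then(render, () => { /* superseded */ });
//   });
```

Callers should expect the promise of a superseded call to reject, and handle that rejection rather than surfacing it as an error.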

Detailed design discussion

Proofreading correction output

For each input, the method proofread() returns a promise of ProofreadResult:

dictionary ProofreadResult {
  DOMString correctedInput;
  sequence<ProofreadCorrection> corrections;
};

correctedInput is the fully corrected version of the input, while corrections contains a list of corrections made, their locations in the original input (e.g. so web developers can create UI to highlight the error), and optionally labels/explanations.

dictionary ProofreadCorrection {
  unsigned long long startIndex;
  unsigned long long endIndex;
  DOMString correction;
  CorrectionType type; // exists if proofreader.includeCorrectionTypes === true
  DOMString explanation; // exists if proofreader.includeCorrectionExplanations === true
};

enum CorrectionType { "spelling", "punctuation", "capitalization", "preposition", "missing-words", "grammar" };

type only exists when the proofreader object is configured with includeCorrectionTypes = true, while explanation only exists when the proofreader object is configured with includeCorrectionExplanations = true.

Not all correction types here will be applicable to all languages, and in the future we might propose more specific correction types. The generic catch-all type, if no more-specific type matches, is "grammar".
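
As an illustration of how the type labels might be consumed, the sketch below tallies corrections per type, for instance to show a summary such as "2 spelling, 1 grammar". The function name countCorrectionsByType and the "unlabeled" bucket (used when type is absent because includeCorrectionTypes was not enabled) are hypothetical.

```javascript
// Hypothetical helper: count corrections per CorrectionType.
// Corrections without a `type` are counted under "unlabeled", a bucket name
// invented for this sketch.
function countCorrectionsByType(corrections) {
  const counts = {};
  for (const correction of corrections) {
    const key = correction.type === undefined ? "unlabeled" : correction.type;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}
```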

To get an error in the input, use input.substring(startIndex, endIndex). Corrections in the corrections list will be organized in ascending order based on the startIndex of the correction.

Example usage of the output to highlight error in input:

let inputRenderIndex = 0;

for (const correction of corrections) {
  // Render part of input that has no error.
  if (correction.startIndex > inputRenderIndex) {
    const unchangedInput = document.createElement('span');
    unchangedInput.textContent = input.substring(inputRenderIndex, correction.startIndex);
    editBox.append(unchangedInput);
  }
  // Render part of input that has an error and highlight as such.
  const errorInput = document.createElement('span');
  errorInput.textContent = input.substring(correction.startIndex, correction.endIndex);
  errorInput.classList.add('error');
  editBox.append(errorInput);
  inputRenderIndex = correction.endIndex;
}

// Render rest of input that has no error.
if (inputRenderIndex !== input.length){
  const unchangedInput = document.createElement('span');
  unchangedInput.textContent = input.substring(inputRenderIndex, input.length);
  editBox.append(unchangedInput);
}
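
The same indices can also be used to rebuild text from a chosen subset of corrections, for example only those the user accepted. The following sketch assumes the corrections passed in are sorted by ascending startIndex and do not overlap, and that endIndex is exclusive (consistent with the substring() usage above). applyCorrections is a hypothetical helper, not part of the proposed API.

```javascript
// Hypothetical helper: apply a sorted, non-overlapping list of corrections to
// the original input and return the resulting text.
function applyCorrections(input, corrections) {
  let result = "";
  let cursor = 0;
  for (const { startIndex, endIndex, correction } of corrections) {
    // Copy the untouched text before this correction, then the replacement.
    result += input.substring(cursor, startIndex) + correction;
    cursor = endIndex;
  }
  // Copy whatever remains after the last correction.
  return result + input.substring(cursor);
}
```

Whether applying every correction always reproduces the corrected text returned by proofread() is not stated in this explainer, so this helper should be treated as a UI convenience rather than a substitute for that value.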

Full API surface in Web IDL

[Exposed=(Window,Worker), SecureContext]
interface Proofreader {
  static Promise<Proofreader> create(optional ProofreaderCreateOptions options = {});
  static Promise<AIAvailability> availability(optional ProofreaderCreateCoreOptions options = {});

  Promise<ProofreadResult> proofread(
    DOMString input,
    optional ProofreaderProofreadOptions options = {}
  );
  ReadableStream proofreadStreaming(
    DOMString input,
    optional ProofreaderProofreadOptions options = {}
  );

  // whether to provide correction types for each correction as part of the proofreading result.
  readonly attribute boolean includeCorrectionTypes;
  // whether to provide explanations for each correction as part of the proofreading result.
  readonly attribute boolean includeCorrectionExplanations;
  readonly attribute DOMString? correctionExplanationLanguage;
  readonly attribute FrozenArray<DOMString>? expectedInputLanguages;

  undefined destroy();
};

dictionary ProofreaderCreateCoreOptions {
  boolean includeCorrectionTypes = false;
  boolean includeCorrectionExplanations = false;
  DOMString correctionExplanationLanguage;
  sequence<DOMString> expectedInputLanguages;
};

dictionary ProofreaderCreateOptions : ProofreaderCreateCoreOptions {
  AbortSignal signal;
  AICreateMonitorCallback monitor;
};

dictionary ProofreaderProofreadOptions {
  AbortSignal signal;
};

dictionary ProofreadResult {
  DOMString correctedInput;
  sequence<ProofreadCorrection> corrections;
};

dictionary ProofreadCorrection {
  unsigned long long startIndex;
  unsigned long long endIndex;
  DOMString correction;
  CorrectionType type;
  DOMString explanation;
};

enum CorrectionType {
  "spelling",
  "punctuation",
  "capitalization",
  "preposition",
  "missing-words",
  "grammar"
};

Alternatives considered and under consideration

Provide explanations only asynchronously

To offer a more comprehensive proofreading API, in addition to labeling the error type for each correction made, we considered annotating each correction with an explanation. Users of such proofreading capability can benefit from it to improve their writing skills.

However, due to technical limitations of the on-device language model, generating even one short explanation takes significantly longer than real time, and a short sentence or paragraph may need several.

To address this, we propose to deliver explanations asynchronously, separately from the list of corrections (ProofreadCorrection), through a streaming API. Specifically, instead of returning explanations for all corrections at once, we would return each correction’s explanation as it becomes available. This way, web developers can update the UI sooner and make the experience less jarring for users.

Interaction with other browser-integrated proofreading features

As web developers implement UX around this proofreading API, the experience could get confusing if the user’s browser also supports other integrated proofreading features, with two features trying to help at once.

The spellcheck attribute from HTML available across browsers might help developers to signal to the browser to turn off its integrated spelling check if it has one. For example, when spellcheck is set to false, no red underlines/squiggly lines will appear to indicate a spelling error.

For more sophisticated browser integrated proofreading features, it’s an open question how to address the potential conflicts. For example, for browser extensions, one option is for web developers to detect the presence of certain extensions and then decide the behavior of their own proofreading feature.

Customization with user-mutable dictionary

While the proposed Proofreading API corrects user input based on general knowledge, there could be cases where users would prefer that certain proper names, acronyms, etc. not be corrected. For example, the proposed Dictionary API allows users to add and remove words from the browser’s custom dictionary to address special use cases.

The Proofreading API can potentially allow users to specify a custom dictionary, and avoid correcting any words included in the dictionary.

However, in cases where ignoring certain words for correction could potentially change the meaning/structure of a sentence, it could be a bit tricky to proofread with pre-trained language models. Therefore, we are moving forward without integration with custom dictionaries until further exploration and evaluation is done. Nevertheless, we invite discussion of all of these APIs within the Web Machine Learning Community Group.
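
Until such integration exists, a page could approximate an ignore list on its own by discarding corrections that target words the user has chosen to keep. The sketch below is purely a client-side workaround under that assumption; dropIgnoredCorrections and the ignoredWords parameter are hypothetical, and the approach shares the caveat above, since the remaining corrections were computed as if the ignored word would also change.

```javascript
// Hypothetical helper: remove corrections whose original text exactly matches
// (case-insensitively) a word in the user's ignore list.
function dropIgnoredCorrections(input, corrections, ignoredWords) {
  const ignored = new Set([...ignoredWords].map((word) => word.toLowerCase()));
  return corrections.filter(({ startIndex, endIndex }) => {
    const original = input.substring(startIndex, endIndex).toLowerCase();
    return !ignored.has(original);
  });
}
```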
