MinT

From mediawiki.org

MinT (Machine in Translation) is a machine translation service based on open-source neural machine translation models. The service is hosted on Wikimedia Foundation infrastructure and runs translation models released by other organizations under an open-source license. An open machine translation service can be a key part of the core infrastructure for the free knowledge ecosystem. This page presents the initiatives under way to grow the service and make this infrastructure widely available.

You can try MinT as part of projects such as Content Translation and translatewiki.net, or directly in a test instance.

Overview of MinT initiatives

Machine translation can be useful in different contexts. As more products make use of MinT for different purposes, it helps to tell those contexts apart. That way, when users report a bug, it is clearer where it needs to be fixed.

  • MinT Service. The backend service running open-source neural machine translation models.
    • MinT test instance. A basic interface to try the different translation models.
  • MinT for Translators. Initiative to integrate the MinT Service with tools that support other machine translation services, such as Content Translation and the Translate extension.
    • MinT Client for Content Translation. Client exposing the MinT Service as one of the machine translation services available in Content Translation.
    • MinT Client for Translate extension. Client exposing the MinT Service as one of the machine translation services available in the Translate extension.
  • MinT for Wiki Readers. Product to enable readers to use machine translation to read contents from other languages on a wiki.

You can read more below about each of the MinT initiatives.

Get involved

Feel free to share any feedback on the discussion page. Planned improvements are captured in Phabricator (more info), where you can report wrong behavior or propose feature enhancements, track the progress of any task, and share your perspective on it. For completed work, you can also check the status updates below.

MinT Service

The MinT Service is designed to provide translations from multiple machine translation models. Currently, it uses the following models:

MinT supports over 200 languages, with more than 70 languages not supported by other services (including 27 languages for which there is no Wikipedia yet). You can read more about the initial release of MinT and check some frequently asked questions in the summary page for the service.

Technical details

The translation models have been optimized for performance using the OpenNMT CTranslate2 library in order to avoid the need for GPU acceleration. This makes it easier for organizations and individuals to build and run their own instances. For more details you can check the following:

MinT provides a platform to run multiple translation models. In order to support different initiatives, aspects such as sentence segmentation, language detection, pre/post-processing of contents, and rich format support have been developed on top of the plain-text based models.
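The layering described above can be illustrated with a minimal sketch. It is not MinT's actual code: `split_sentences`, `translate_text`, and `fake_model` are hypothetical names, and the model is a stand-in that simply upper-cases its input. The point is only to show how a wrapper can feed a sentence-level, plain-text model one sentence at a time and reassemble paragraphs afterwards.

```python
import re
from typing import Callable, List

# Hypothetical stand-in type for a plain-text, sentence-level translation model.
Translator = Callable[[str], str]


def split_sentences(paragraph: str) -> List[str]:
    """Naive segmentation: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def translate_text(text: str, translate_sentence: Translator) -> str:
    """Translate multi-paragraph text with a model that only sees one sentence."""
    translated_paragraphs = []
    for paragraph in text.split("\n\n"):
        sentences = split_sentences(paragraph)
        translated = [translate_sentence(s) for s in sentences]
        # Post-processing: rejoin sentences and keep paragraph boundaries.
        translated_paragraphs.append(" ".join(translated))
    return "\n\n".join(translated_paragraphs)


def fake_model(sentence: str) -> str:
    """Placeholder for a real model call (assumption: upper-casing stands in for translation)."""
    return sentence.upper()


if __name__ == "__main__":
    print(translate_text("Hello there. How are you?\n\nFine.", fake_model))
```

Keeping the model call behind a plain callable means the same wrapper could, in principle, sit in front of any of the supported models.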

Test instance

The MinT test instance is a basic interface to try the different translation models. It allows you to translate content across the selected language pairs and to select the preferred translation model when several are available. This lets different communities check how well the models support their language. The instance is intended for testing, so performance and availability may be reduced compared to other MinT-based products. You can check the availability status of the MinT test instance.

MinT for translators

Mobile translation using MinT

Translation is a common way for multilingual users to contribute in the Wikimedia ecosystem. Machine translation can provide a useful initial translation for users to review and improve. The Language team has developed tools that support translators in their workflows and can integrate different machine translation services to speed up the process. Once MinT was available, integrating it with these tools was a logical next step to amplify their impact. MinT is available in the following projects:

  • Content Translation. Content Translation provides guidance to create a translation of a Wikipedia article into another language.

Content Translation integrates several translation services to provide an initial translation. You can check which languages supported by MinT are available in Content Translation.

  • Localization infrastructure. The Translate extension provides the infrastructure used to translate our software and multilingual pages.

Communities of translators use it on translatewiki.net, Wikimedia Meta-wiki, MediaWiki.org and more.

MinT for Wiki Readers

The number of topics and the amount of information that readers can learn from Wikipedia and other wikis depends on the languages they speak. Machine translation can help people learn more about the topics they care about when content is not available in their language.

This initiative explores how to surface machine translation support from MinT in Wikipedia articles in a way that:

  • Allows readers to learn more about topics of interest from other languages.
  • Clearly differentiates automatically generated content from community-created content.
  • Encourages access and contribution to community-created content when possible.

Currently, the Language team is working on the initial implementations of this initiative based on research and designs. Data-driven learnings and community input will determine the next steps for the initiative.

MinT more widely available

Working on the previous initiatives will help to polish and solidify the system. For now, the MinT API is only available for Wikimedia products. As the system gets ready, we'll consider a wider exposure. Providing a service that can be used by communities in innovative ways can be a very powerful tool. New initiatives to make MinT more widely available will be captured here in the future. Meanwhile, feel free to configure your own MinT instance to experiment with it.

Disclaimer

  1. Accuracy of MinT’s Translations: The accuracy of translations generated by MinT may vary. Translations may not be entirely accurate or may not always convey the intended meaning or context of the original content. Wikimedia makes no representations or warranties regarding the accuracy or adequacy of the automatically translated content.
  2. Limitation of Liability: Wikimedia, its affiliates, and employees are not liable for any direct, indirect, incidental, punitive, or consequential damages, including but not limited to damages for goodwill, use, data, or any other intangible losses arising out of or in connection with the use of MinT or translations generated with MinT.
  3. Creative Commons Compliance: Translations generated with MinT are considered derivative works under the applicable Creative Commons license governing the original content. Users shall comply with the terms of the applicable Creative Commons license when using translated content.
  4. Terms of Use and Privacy Policy: Use of MinT is subject to Wikimedia's Terms of Use and Privacy Policy.

Status updates

February 2024

January 2024

December 2023

November 2023

October 2023

  • Launched the Language Identification service to automatically detect the language in which a given text is written. The service supports the detection of 201 languages, and anyone can access the API to use the service or read the model card for more details. The Machine Learning team completed the last checks after deploying to LiftWing and confirming that the service can "easily withstand a high amount of traffic".
  • Basic support for rich text translation: markup that applies styling, such as words in bold, is transferred from the source text to the equivalent words in the machine translation (which otherwise lacks formatting, since translation models operate on plain text).
  • Completed the process to enable MinT for languages with no Wikipedia yet. Translation models in MinT support 25 languages for which there is no Wikipedia. Speakers of those languages can test them in MinT's test instance to assess quality, and this ensures that translation tools are well-equipped once wikis are created for those languages (as has been the case with the recent graduation of Fon Wikipedia out of the Incubator).
  • Completed the process to enable MinT for closely related languages based on community input. For some languages where machine translation is not available, Wikipedia editors have asked for access to machine translation in Content Translation using a related language instead of having no support at all. With this change, translators of Gan (gan) Wikipedia will have machine translation based on the traditional script variant of Chinese as a starting point.
  • Analysis of translation activity in the 55 languages for which MinT provides machine translation for the first time shows that (a) translations have doubled since MinT became available, and (b) deletion rates have not increased. Activity for these 55 wikis went from ~500 translations/month to 1K+ translations/month after MinT was enabled. For example, a recent peak of 2.15K translations was published in August 2023, when MinT was available for those languages, a significant increase from the 225 translations in August 2022, when it was not.
  • Better visibility of translation quality by including a tag in translations where unedited machine translation is close to the limits. This will facilitate analysis of translation quality and limits.
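The markup transfer mentioned in the list above can be sketched as follows. This is a simplified heuristic under stated assumptions, not MinT's implementation: the function names are hypothetical, only flat `<b>` and `<i>` tags are handled, and `toy_translate` is a word-for-word lexicon lookup standing in for a real model. The idea is to translate the plain text, translate each styled fragment on its own, and re-wrap the fragment where it reappears in the translation.

```python
import re
from typing import Callable, List, Tuple

# Matches flat (non-nested) bold or italic spans such as <b>word</b>.
TAG = re.compile(r"<(b|i)>(.*?)</\1>")


def strip_markup(html: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the plain text plus the (tag, fragment) pairs that were styled."""
    spans = [(m.group(1), m.group(2)) for m in TAG.finditer(html)]
    plain = TAG.sub(lambda m: m.group(2), html)
    return plain, spans


def translate_rich(html: str, translate: Callable[[str], str]) -> str:
    """Translate plain text, then re-apply styling to the equivalent fragments."""
    plain, spans = strip_markup(html)
    result = translate(plain)
    for tag, source_fragment in spans:
        target_fragment = translate(source_fragment)
        # If the fragment cannot be located, the styling is dropped rather than guessed.
        if target_fragment in result:
            wrapped = f"<{tag}>{target_fragment}</{tag}>"
            result = result.replace(target_fragment, wrapped, 1)
    return result


# Toy English-to-Spanish lexicon (an illustrative assumption, not a real model).
LEXICON = {"the": "el", "cat": "gato", "sleeps": "duerme"}


def toy_translate(text: str) -> str:
    return " ".join(LEXICON.get(word, word) for word in text.split())


if __name__ == "__main__":
    print(translate_rich("the <b>cat</b> sleeps", toy_translate))
```

A real system would need a more robust way to align source and target words, since a fragment translated in isolation does not always match its translation in context.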

September 2023

A message well received by the attendees.

  • Research planning started with an initial draft of the research brief for MinT on Wikipedia.
  • Continued technical explorations for applying machine translation beyond plain text (which is all the underlying models provide) to support the Wikipedia context. A new, improved approach to sentence segmentation (with a demo page to try) identifies more accurately where a sentence ends in different languages and prefers not to split in case of doubt. This is preferable for machine translation because it avoids fragmenting the context of a translation, for example by misinterpreting the period of an abbreviation as a full stop.
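The "do not split in case of doubt" preference can be illustrated with a small sketch. It is not the segmentation approach referred to above: the abbreviation list is a short illustrative assumption, and the rules (skip known abbreviations, single-letter initials, and periods followed by a lowercase letter) are simple heuristics chosen for the example.

```python
import re
from typing import List

# Illustrative, incomplete abbreviation list (an assumption for this sketch).
ABBREVIATIONS = {"dr", "mr", "mrs", "prof", "etc", "vs", "st"}

# Candidate boundary: sentence-final punctuation followed by whitespace.
BOUNDARY = re.compile(r"([.!?])\s+")


def segment(text: str) -> List[str]:
    """Split text into sentences, keeping text together whenever a period is ambiguous."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        candidate = text[start:match.end(1)]
        next_char = text[match.end():match.end() + 1]
        if match.group(1) == ".":
            words = candidate[:-1].split()
            last_word = words[-1].lower() if words else ""
            # In doubt, do not split: abbreviation, initial, or lowercase continuation.
            if (last_word in ABBREVIATIONS
                    or len(last_word) == 1
                    or next_char.islower()):
                continue
        sentences.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    for sentence in segment("Dr. Smith arrived at 5 p.m. on Monday. He was late."):
        print(sentence)
```

Under-splitting costs little here, because a model can still translate two sentences passed together, whereas a wrong split hands it a fragment such as "Dr." with no context.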

August 2023

July 2023