‘Franko Arab’: writing Arabic words with English letters
The first time I met Brielle Nickoloff, Product Manager @ Botmock, she asked me if I had come across any functional differences when designing conversations in Arabic versus English. I didn’t have a clear answer, so I promised to keep it in mind and get back to her soon.
The question began marinating in my head. After a few days I responded with some of my initial ideas, and I thought I’d share them here with whoever is interested in the topic.
*Please note that my analysis of linguistic differences between nationalities, countries, or dialects in this article is not intended to generalize or discriminate. Rather, it is an attempt to share my personal experience as a bilingual speaker of Arabic and English who also happens to systematically analyze both languages from a technical perspective when I design conversations in either language. My point of view is subject to critique (and critique is welcome!), and I’m always open to thoughts, questions, and feedback. After all, I am not a linguist, so everything I’ll share is based on my own technical experience and knowledge designing conversations in both English and Arabic.
Now, let’s dive in!
Cultural before Linguistic
Unrelated to the fact that I am a conversation designer, as a native speaker of Arabic, I know there are important differences between Arabic, English, and many other languages. However, even before we think about the more tangible differences (like vocabulary, sounds, and inflection) I believe the sociolinguistic differences are most important to think about first. These are the cultural differences that can be felt through language.
I will try to break these down below.
Differences between Arabic and English and their Impact on Design/CUI
It is worth highlighting that how well these themes hold varies across Arabic-speaking regions and countries, across social segments, and with the rise of self-development and learning.
Linguistic patterns vary significantly among the various Arab countries and regions. Let me try to break this down a bit, starting with Egypt:
Colloquial dictionaries are expanding quickly in Egypt and the spellings of these novel words change quickly as a result. The way someone spells a word often depends on their level of education and social class.
For example, for readers who don’t speak Arabic: the number “82” can be written in Arabic letters in more than 50 ways because of alternative letters and spacing. As is true in most languages, people generally don’t use the “textbook” spelling when texting, whether they’re talking to a friend or a chatbot. Machine Learning (ML) algorithms can reduce the training effort needed to interpret such utterances with high confidence. (On a recent design, I wasn’t able to use an ML algorithm, so I had to generate those synonyms for the numbers from 1 to 200. It was quite hectic, to be honest. 👀 )
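To make the “alternative letters and spacing” point concrete, here is a minimal Python sketch of one way such spelling variants could be generated without ML. The slots below are illustrative assumptions for a colloquial spelling of “82”, not a complete inventory, and `expand_variants` is a hypothetical helper name.

```python
from itertools import product

# Each slot lists the alternative spellings for one position in the phrase.
# These alternatives are illustrative assumptions, not an exhaustive list.
SLOTS_82 = [
    ["ا", "أ", "إ"],      # opening alif, with or without a hamza
    ["ث", "ت"],           # formal "th" vs. colloquial "t"
    ["نين "],             # rest of "two", followed by a space
    ["و", "و "],          # "and", attached to the next word or spaced
    ["ث", "ت"],           # formal "th" vs. colloquial "t"
    ["مانين", "منين"],    # with or without the long vowel
]


def expand_variants(slots):
    """Return every spelling formed by picking one alternative per slot."""
    return ["".join(choice) for choice in product(*slots)]


variants = expand_variants(SLOTS_82)
print(len(variants))  # 3 * 2 * 1 * 2 * 2 * 2 = 48 spellings
```

Even six small slots multiply into dozens of spellings, which is why the count grows so quickly. The resulting list could be fed to the bot as synonyms for a single number entity.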
It also is very common for Egyptians to speak/write English words (common nouns, verbs, etc…) using Arabic letters and sounds. That adds another dimension to training phrases unless your NLP system is smart enough to match non-Arabic words — since there isn’t just one way to spell or say the word. To clarify, the word “Packages” can be written as “باكدجز”, “باكدجس”, “باكدجيز”, “باكيدجيز” and God knows what else. 🙂
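One way to cope with loanword spellings like these, when the NLP system does not handle them on its own, is approximate string matching. The sketch below uses Python's standard `difflib` module. The `LOANWORDS` table, the `match_loanword` helper, and the 0.75 cutoff are assumptions chosen for illustration and would need tuning on real traffic.

```python
from difflib import get_close_matches

# Hypothetical table: one reference spelling per Arabized English word.
LOANWORDS = {"باكدجز": "packages"}


def match_loanword(token, vocabulary=LOANWORDS, cutoff=0.75):
    """Map a token to a known loanword if its spelling is close enough."""
    hits = get_close_matches(token, list(vocabulary), n=1, cutoff=cutoff)
    return vocabulary[hits[0]] if hits else None


# The four spellings of "Packages" mentioned above.
for spelling in ["باكدجز", "باكدجس", "باكدجيز", "باكيدجيز"]:
    print(spelling, "->", match_loanword(spelling))
```

A cutoff that is too low will start matching unrelated words, so any threshold should be validated against logged utterances before it is trusted.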
Because of these phenomena, it’s crucial that chatbot tuning for Egyptian Arabic is proactive and ongoing. There will always be new words that your users text to the bot, and these need to be accounted for and added to the system’s design regularly.
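As a sketch of what ongoing tuning can look like in practice, the snippet below counts words from logged utterances that the bot's vocabulary does not yet cover, so the most frequent ones can be reviewed first. The function name, the whitespace tokenization, and the sample data are all assumptions for illustration.

```python
from collections import Counter


def unknown_token_counts(utterances, known_tokens):
    """Count whitespace-separated tokens that are not in the known vocabulary."""
    counts = Counter()
    for utterance in utterances:
        for token in utterance.split():
            if token not in known_tokens:
                counts[token] += 1
    # Most frequent unknown tokens first, ready for manual review.
    return counts.most_common()


# Made-up log sample: "عايز" (colloquial "I want") is assumed to be known already.
known = {"عايز"}
logged = ["عايز باكدجيز", "باكدجيز جديدة", "عايز نت"]
review_queue = unknown_token_counts(logged, known)
```

Running a report like this on a regular schedule keeps new spellings from piling up unnoticed between design iterations.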
The reverse is actually true in some parts of the Gulf. Franko Arab is what we call it when someone writes Arabic words in English letters.
One example: ‘Salam’ means ‘peace’ and is used interchangeably for “goodbye” or sometimes for “hi”.
A second example: it is very common in e-commerce channels to ask about pricing by saying “H.M” (for ‘how much’) even if the conversation is in Arabic. I learned this the hard way when I started working in chatbots. 🙂
Even amongst Arabs, Egyptians are known to have their own version of Arabic slang, the one that deviates the most from Standard Arabic.
Other Arab Regions
I can’t go into detail about every other country, but I can comment at a high level.
Levant (Lebanon, Jordan/Palestine, and Syria): these regions’ dialects are similar in syntax and phrases, though each has minor differences in terminology. Implicatures abound in their conversations.
North Africa (Libya, Tunisia, Algeria, and Morocco): in these regions, there are broad differences when it comes to education level and the heavy use (or mixing in) of other languages, like French, with Arabic.
The same happens in Tunisia, Algeria, and Morocco: Moroccan Arabic is hardly understood by other Arabic speakers, since its speakers mix Arabic, Berber (Amazigh), and French in their conversations, generating what sounds like a totally new language that only they themselves can understand. 🙂
Impact of these geolinguistic patterns on Conversation Design
Anyone designing conversations in this region must pay special attention to vocabulary differences and slang-specific syntax.
And, like any other country in the world, different regions within the same country have different slang and terminology.
Script Styling still works for Arabic prompts
One more observation is that prompt-scripting guidelines mostly work very well for scripting Arabic prompts. I initially wasn’t sure they would work, as they seemed designed specifically for English (and maybe English-like languages).
Of course, not all principles apply, since Arabic has neither capitalization nor contractions, for instance. Still, other more significant principles like Focus on User, Lead with Benefits, End-User Focus and others remain extremely helpful for scripting sound conversations.
Voice Technology is still behind in Arabic slang
While there are a number of Arabic conversational interfaces whose NLP systems have been tuned to process and understand Arabic slang, there is still a big gap here, especially for Text-To-Speech (TTS) systems. Voice User Interface (VUI) platforms like Google Assistant and Amazon Alexa both support Modern Standard Arabic (MSA), which is a modernized hybrid of textbook Arabic (فصحى) and modern colloquial Arabic. No one speaks MSA other than news anchors, so using MSA synthesized speech in VUIs is enough to totally disconnect end-users (receivers) from the conversation: it’s not engaging and sounds unnatural.
I believe this is a great area of opportunity that I hope CUI platform providers will invest in. This is the only way to increase the adoption of voice-assistant-powered devices in the Arabic-speaking region.
Arabic slang and dialects differ widely from one another, though they all share the same “Standard Arabic”, which isn’t really used in casual day-to-day conversation.
My advice is to treat each as a separate language, with some possible reuse of elements taken from the others. It’s always better to avoid assuming a high level of similarity until it is proven during your work and user testing!
Finally, while I’m not a linguist, I hope my perspective was valuable as an Arabic speaker who has lived in Egypt and the Gulf while working within the conversation design domain.
Thank you for reading, and please share your point of view in the comments!
Check out this article and others about conversation design on Botmock’s blog!