Convert Unicode Accented Characters to Unaccented in C++

A detailed guide on converting Unicode accented characters to unaccented characters in C++ using the `` library. This article helps you handle Vietnamese text more effectively.

This article provides a guide on converting Unicode accented characters to unaccented ones in C++. We will use a character mapping table to replace accented characters with their unaccented equivalents.

C++ code

#include <iostream>
#include <unordered_map>
#include <string>

// Function to convert accented characters to unaccented
std::string removeAccents(const std::string &input) {
    // Mapping table from accented to unaccented characters
    std::unordered_map<char, char> charMap = {
        {'á', 'a'}, {'à', 'a'}, {'ả', 'a'}, {'ã', 'a'}, {'ạ', 'a'},
        {'ă', 'a'}, {'ắ', 'a'}, {'ằ', 'a'}, {'ẳ', 'a'}, {'ẵ', 'a'}, {'ặ', 'a'},
        {'â', 'a'}, {'ấ', 'a'}, {'ầ', 'a'}, {'ẩ', 'a'}, {'ẫ', 'a'}, {'ậ', 'a'},
        {'é', 'e'}, {'è', 'e'}, {'ẻ', 'e'}, {'ẽ', 'e'}, {'ẹ', 'e'},
        {'ê', 'e'}, {'ế', 'e'}, {'ề', 'e'}, {'ể', 'e'}, {'ễ', 'e'}, {'ệ', 'e'},
        {'í', 'i'}, {'ì', 'i'}, {'ỉ', 'i'}, {'ĩ', 'i'}, {'ị', 'i'},
        {'ó', 'o'}, {'ò', 'o'}, {'ỏ', 'o'}, {'õ', 'o'}, {'ọ', 'o'},
        {'ô', 'o'}, {'ố', 'o'}, {'ồ', 'o'}, {'ổ', 'o'}, {'ỗ', 'o'}, {'ộ', 'o'},
        {'ơ', 'o'}, {'ớ', 'o'}, {'ờ', 'o'}, {'ở', 'o'}, {'ỡ', 'o'}, {'ợ', 'o'},
        {'ú', 'u'}, {'ù', 'u'}, {'ủ', 'u'}, {'ũ', 'u'}, {'ụ', 'u'},
        {'ư', 'u'}, {'ứ', 'u'}, {'ừ', 'u'}, {'ử', 'u'}, {'ữ', 'u'}, {'ự', 'u'},
        {'ý', 'y'}, {'ỳ', 'y'}, {'ỷ', 'y'}, {'ỹ', 'y'}, {'ỵ', 'y'},
       

 {'đ', 'd'},
        {'Á', 'A'}, {'À', 'A'}, {'Ả', 'A'}, {'Ã', 'A'}, {'Ạ', 'A'},
        {'Ă', 'A'}, {'Ắ', 'A'}, {'Ằ', 'A'}, {'Ẳ', 'A'}, {'Ẵ', 'A'}, {'Ặ', 'A'},
        {'Â', 'A'}, {'Ấ', 'A'}, {'Ầ', 'A'}, {'Ẩ', 'A'}, {'Ẫ', 'A'}, {'Ậ', 'A'},
        {'É', 'E'}, {'È', 'E'}, {'Ẻ', 'E'}, {'Ẽ', 'E'}, {'Ẹ', 'E'},
        {'Ê', 'E'}, {'Ế', 'E'}, {'Ề', 'E'}, {'Ể', 'E'}, {'Ễ', 'E'}, {'Ệ', 'E'},
        {'Í', 'I'}, {'Ì', 'I'}, {'Ỉ', 'I'}, {'Ĩ', 'I'}, {'Ị', 'I'},
        {'Ó', 'O'}, {'Ò', 'O'}, {'Ỏ', 'O'}, {'Õ', 'O'}, {'Ọ', 'O'},
        {'Ô', 'O'}, {'Ố', 'O'}, {'Ồ', 'O'}, {'Ổ', 'O'}, {'Ỗ', 'O'}, {'Ộ', 'O'},
        {'Ơ', 'O'}, {'Ớ', 'O'}, {'Ờ', 'O'}, {'Ở', 'O'}, {'Ỡ', 'O'}, {'Ợ', 'O'},
        {'Ú', 'U'}, {'Ù', 'U'}, {'Ủ', 'U'}, {'Ũ', 'U'}, {'Ụ', 'U'},
        {'Ư', 'U'}, {'Ứ', 'U'}, {'Ừ', 'U'}, {'Ử', 'U'}, {'Ữ', 'U'}, {'Ự', 'U'},
        {'Ý', 'Y'}, {'Ỳ', 'Y'}, {'Ỷ', 'Y'}, {'Ỹ', 'Y'}, {'Ỵ', 'Y'},
        {'Đ', 'D'}
    };

    std::string output = "";
    for (char c : input) {
        if (charMap.find(c) != charMap.end()) {
            output += charMap[c];
        } else {
            output += c;
        }
    }

    return output;
}

int main() {
    std::string text = "Chữ có dấu: Việt Nam rất đẹp!";
    std::cout << "Original string: " << text << std::endl;
    std::cout << "Unaccented string: " << removeAccents(text) << std::endl;
    return 0;
}

Detailed explanation:

  • std::unordered_map<char, char>: Creates a mapping table from accented to unaccented characters.
  • Loop for (char c : input): Iterates through each character of the input string.
  • charMap.find(c): Checks if the character exists in the mapping table, then appends the corresponding unaccented character to the output.

System Requirements:

  • C++11 or later

How to install the libraries needed to run the C++ code above:

Use a compiler that supports C++11 or later, such as GCC or Visual Studio.

Tips:

  • For large Vietnamese texts, consider using optimized algorithms or libraries that handle Unicode conversion efficiently.
Tags: Unicode, C++


Related

Example of Object-Oriented Programming (OOP) in C++

This article provides an illustrative example of object-oriented programming (OOP) in C++, covering concepts such as classes, objects, inheritance, and polymorphism.
Common Functions When Using Selenium Chrome in C++

This article lists the common functions used when working with Selenium Chrome in C++, helping readers quickly grasp the necessary operations for browser automation.
How to append an Authentication Header Token when POSTing data to an API in C++

A guide on how to pass an authentication token via the Authentication Header when POSTing data to an API using C++. The example utilizes the `libcurl` library to perform HTTP requests with token-based authentication.
Create a Simple Chat Application Using Socket.IO in C++

A guide on how to create a simple chat application using C++ with Socket.IO, helping you to understand more about network programming and real-time communication.
Create a Thumbnail for Images in C++

A detailed guide on how to create a thumbnail for images in C++ using the OpenCV library. This article will help you understand how to process images and easily resize them to create thumbnails.
How to automatically log into a website using Selenium with Chrome in C++

A guide on using Selenium with ChromeDriver in C++ to automatically log into a website. The article explains how to set up Selenium and ChromeDriver and the steps to log in to a specific website.
Create a watermark for images using C++

A guide on how to create a watermark for images in C++ using the OpenCV library. This article helps you understand how to add text or images onto a photo to create a watermark.
Updating Data in MySQL Using C++

A guide on how to update data in MySQL using C++ with Prepared Statements, ensuring security and efficiency when interacting with the database. This article provides a clear illustrative example.
How to Write Data to an Excel File Using C++

A detailed guide on writing data to an Excel file using C++ and the openxlsx library. This article provides the necessary steps to create and write data to an Excel file easily.
How to POST data to an API using C++ with libcurl

A guide on how to send data to an API using the POST method in C++ with the libcurl library. This article will help you understand how to configure and send HTTP POST requests to a RESTful API.

main.add_cart_success