What regex can match similar characters? [duplicate]

This question already has an answer here:

  • Converting Symbols, Accent Letters to English Alphabet (12 answers)

What regex could match similar characters, such as ä and a, or, in Russian, и and й?
My code is below:



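import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text1 = " Passagiere noch auf ihr fehlendes Gepäck";
String text2 = " Passagiere noch auf ihr fehlendes Gepack";

// "\\b" is the regex word boundary; a bare "\b" in a Java string literal is a backspace
Pattern p1 = Pattern.compile("\\b" + "Gepack");
Pattern p2 = Pattern.compile("\\b" + "Gepack");

Matcher m1 = p1.matcher(text1); // m1.find() returns false: no occurrence
Matcher m2 = p2.matcher(text2); // m2.find() returns true: one occurrence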









marked as duplicate by Wiktor Stribiżew (java badge), Mar 7 at 14:34
Users with the java badge can single-handedly close java questions as duplicates and reopen them as needed.

This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.

  • Not sure this is the right duplicate, as the linked-to article is more about transliteration than normalisation.

    – JGNI
    Mar 7 at 14:41















java android regex pattern-matching






asked Mar 7 at 14:00 by hamid (edited Mar 7 at 14:01)




1 Answer

You could build a character class containing every variant you want to match, so pattern one could be replaced with:



Pattern p1 = Pattern.compile("\\b" + "Gep[aä]ck");


But this gets burdensome very quickly.
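The character-class approach can be sketched as below. This is only an illustration, not a general solution: the class name `CharClassDemo` and the helper `containsGepack` are made up for the example, and `\u00E4` is simply ä written as an escape so the source does not depend on file encoding.

```java
import java.util.regex.Pattern;

public class CharClassDemo {
    // Hypothetical helper: true if the text contains "Gepack" or "Gepäck" at a word start.
    static boolean containsGepack(String text) {
        // "\\b" is the regex word boundary; [a\u00E4] accepts either 'a' or 'ä'.
        Pattern p = Pattern.compile("\\bGep[a\u00E4]ck");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsGepack(" Passagiere noch auf ihr fehlendes Gep\u00E4ck"));
        System.out.println(containsGepack(" Passagiere noch auf ihr fehlendes Gepack"));
    }
}
```

Every letter with variants needs its own hand-written class, which is why this does not scale. It also assumes the input stores ä as a single precomposed code point; text that stores it as `a` followed by a combining diaeresis would not match this pattern.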



Unicode has a mechanism called Normalisation that lets you reformat a string so that it can be compared in different ways.



Normalisation Form Canonical Decomposition (NFD) takes a string containing accented characters and splits each one into multiple code points: the base character first, followed by the code points of the combining characters corresponding to its accents, in a well-defined order.
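To make the decomposition concrete, here is a small sketch that prints the code points NFD produces for the two letters from the question. It uses the standard `java.text.Normalizer`; the class name `NfdDemo` and the helper `nfdCodePoints` are assumptions made for this example.

```java
import java.text.Normalizer;

public class NfdDemo {
    // Hypothetical helper: the NFD form of s, rendered as space-separated "U+XXXX" code points.
    static String nfdCodePoints(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        StringBuilder sb = new StringBuilder();
        decomposed.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // ä (U+00E4) becomes 'a' followed by COMBINING DIAERESIS
        System.out.println(nfdCodePoints("\u00E4")); // U+0061 U+0308
        // й (U+0439) becomes 'и' followed by COMBINING BREVE
        System.out.println(nfdCodePoints("\u0439")); // U+0438 U+0306
    }
}
```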



Having done this to your input, you can use a regex to remove all the accents from the string, as they all have the Unicode property Mark, sometimes shortened to M.



This leaves a string containing only base characters, which your regex can then match against.
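Putting the steps above together, a minimal sketch might look like this. The class `AccentInsensitiveMatch` and the helpers `stripAccents` and `containsWord` are hypothetical names chosen for the example; `Normalizer.normalize` and the `\p{M}` property class are standard Java.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

public class AccentInsensitiveMatch {
    // Matches runs of combining marks (Unicode general category M).
    private static final Pattern MARKS = Pattern.compile("\\p{M}+");

    // Decompose to NFD, then drop the combining marks, leaving the base characters.
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return MARKS.matcher(decomposed).replaceAll("");
    }

    // Hypothetical helper: does text contain word at a word start, ignoring accents on both sides?
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(stripAccents(word)));
        return p.matcher(stripAccents(text)).find();
    }

    public static void main(String[] args) {
        String text1 = " Passagiere noch auf ihr fehlendes Gep\u00E4ck";
        String text2 = " Passagiere noch auf ihr fehlendes Gepack";
        System.out.println(containsWord(text1, "Gepack")); // true
        System.out.println(containsWord(text2, "Gepack")); // true
    }
}
```

Stripping both the text and the search word means it does not matter which side carries the accent. One caveat: letters that have no canonical decomposition, such as ß or ø, are left unchanged by this approach and would still need separate handling.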






answered Mar 7 at 14:28 by JGNI














