JavaScript character utility CharFunk 1.1.0 released

CharFunk is a little library I wrote a few years ago to make it easier to do things with Unicode text. I revisited it recently to clean up and improve the code, and add tests and a few features. The API is pretty simple:

  • CharFunk.getDirectionality(ch) – Used to find the directionality of the character
  • CharFunk.getMatches(string,callback) – Returns an array of contiguous matching strings for which the callback returns true, similar to String.match()
  • CharFunk.isAllLettersOrDigits(string) – Returns true if the string argument is composed of all letters and digits
  • CharFunk.isDigit(ch) – Returns true if provided a length 1 string that is a digit
  • CharFunk.isLetter(ch) – Returns true if provided a length 1 string that is a letter
  • CharFunk.isLetterNumber(ch) – Returns true if provided a length 1 string that is in the Unicode “Nl” category
  • CharFunk.isLetterOrDigit(ch) – Returns true if provided a length 1 string that is a letter or a digit
  • CharFunk.isLowerCase(ch) – Returns true if provided a length 1 string that is lowercase
  • CharFunk.isMirrored(ch) – Returns true if provided a length 1 string that is a mirrored character
  • CharFunk.isUpperCase(ch) – Returns true if provided a length 1 string that is uppercase
  • CharFunk.isValidFirstForName(ch) – Returns true if provided a length 1 string that is a valid leading character for an ECMAScript identifier
  • CharFunk.isValidMidForName(ch) – Returns true if provided a length 1 string that is a valid non-leading character for an ECMAScript identifier
  • CharFunk.isValidName(string,checkReserved) – Returns true if the string is a valid ECMAScript identifier
  • CharFunk.isWhitespace(ch) – Returns true if provided a length 1 string that is a whitespace character
  • CharFunk.indexOf(string,callback) – Returns the first index in the string where the character causes a true return from the callback, or -1 if no match
  • CharFunk.lastIndexOf(string,callback) – Returns the last index in the string where the character causes a true return from the callback, or -1 if no match
  • CharFunk.matchesAll(string,callback) – Returns true if all characters in the provided string result in a true return from the callback
  • CharFunk.replaceMatches(string,callback,ch) – Returns a new string with all matched characters replaced, similar to String.replace()
  • CharFunk.splitOnMatches(string,callback) – Splits the string on all matches, similar to String.split()
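
Several of these functions take a callback that is called with one character at a time. As a rough illustration of that pattern, and not CharFunk's actual implementation, the sketch below shows how functions shaped like getMatches and matchesAll could be built. The names getMatchesSketch, matchesAllSketch and isLetterOrDigitSketch are hypothetical, and the stand-in predicate assumes a JavaScript engine that supports Unicode property escapes.

```javascript
// Hypothetical stand-in for a predicate such as CharFunk.isLetterOrDigit.
// Assumption: the engine supports Unicode property escapes (\p{...} with the u flag).
function isLetterOrDigitSketch(ch) {
    return ch.length === 1 && /[\p{L}\p{Nd}]/u.test(ch);
}

// Collects contiguous runs of characters for which the callback returns true.
function getMatchesSketch(string, callback) {
    var matches = [];
    var current = "";
    for (var i = 0; i < string.length; i++) {
        var ch = string.charAt(i);
        if (callback(ch)) {
            current += ch;
        } else if (current.length > 0) {
            // A non-matching character ends the current run.
            matches.push(current);
            current = "";
        }
    }
    if (current.length > 0) {
        matches.push(current);
    }
    return matches;
}

// Returns true only if the callback returns true for every character.
function matchesAllSketch(string, callback) {
    for (var i = 0; i < string.length; i++) {
        if (!callback(string.charAt(i))) {
            return false;
        }
    }
    return true;
}

getMatchesSketch("Российская Федерация 2014", isLetterOrDigitSketch);
    //returns ["Российская", "Федерация", "2014"]
matchesAllSketch("abc 123", isLetterOrDigitSketch);
    //returns false, because of the space
```

Passing the predicate in as a callback is what keeps the string functions generic: the same loop works for letters, whitespace, or any other character test.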

This allows you to do some things you would have a hard time doing in JavaScript otherwise. JavaScript RegExps are notoriously weak at handling non-ASCII data, because shorthand classes such as \w only recognize ASCII characters. For example, imagine you wanted to do something simple like replace all non-word characters with an underscore. With ASCII text, this is easy:

"The United States of America".replace(/[^\w]/g,"_");
    //returns "The_United_States_of_America"

Unless of course, you are dealing with non-ASCII letters:

"Российская Федерация".replace(/[^\w]/g,"_");
    //returns "____________________"
"جمهورية مصر العربية".replace(/[^\w]/g,"_");
    //returns "___________________"

That’s not what we want.
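
The cause is easy to check one character at a time. In patterns like the ones above, \w matches only the ASCII set [A-Za-z0-9_], so every Cyrillic or Arabic letter counts as a non-word character and gets replaced. A minimal check (the variable names are just for illustration):

```javascript
// Without the u and i flags combined, \w is equivalent to [A-Za-z0-9_].
var wordChar = /\w/;

var asciiLetterMatches = wordChar.test("a");     // true
var cyrillicLetterMatches = wordChar.test("я");  // false
var arabicLetterMatches = wordChar.test("م");    // false
```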

Fortunately, CharFunk can handle this using replaceMatches:

function notLetterOrDigit(ch) {
    return !CharFunk.isLetterOrDigit(ch);
}

CharFunk.replaceMatches("جمهورية مصر العربية",notLetterOrDigit,"_");
    //returns "جمهورية_مصر_العربية"

CharFunk.replaceMatches("Российская Федерация",notLetterOrDigit,"_");
    //returns "Российская_Федерация"
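
For readers who want to experiment without loading the library, here is a rough sketch of how a function shaped like replaceMatches could work. It is not CharFunk's implementation: replaceMatchesSketch and notLetterOrDigitSketch are hypothetical names, the predicate assumes an engine with Unicode property escape support, and the sketch assumes that each matched character is replaced individually.

```javascript
// Hypothetical stand-in for the notLetterOrDigit callback.
// Assumption: the engine supports Unicode property escapes (\p{...} with the u flag).
function notLetterOrDigitSketch(ch) {
    return /[^\p{L}\p{Nd}]/u.test(ch);
}

// Builds a new string, swapping each character the callback matches
// for the replacement and copying every other character unchanged.
function replaceMatchesSketch(string, callback, replacement) {
    var result = "";
    for (var i = 0; i < string.length; i++) {
        var ch = string.charAt(i);
        result += callback(ch) ? replacement : ch;
    }
    return result;
}

replaceMatchesSketch("Российская Федерация", notLetterOrDigitSketch, "_");
    //returns "Российская_Федерация"
```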

This is just one small example of what CharFunk can do. I hope that web developers working on international projects — which is pretty much any web app these days — will find this useful!
