I've found that the variant using
unpack('N', mb_convert_encoding($c, 'UCS-4BE', 'UTF-8'));
is very slow.
Keep this in mind when processing strings longer than 1 KB.
ord
(PHP 4, PHP 5)
ord — Return ASCII value of character
Description
int ord ( string $string )
Returns the ASCII value of the first character of string.
This function complements chr().
Parameters
string - A character.
Return Values
Returns the ASCII value as an integer.
Examples
Example #1 ord() example
<?php
$str = "\n";
if (ord($str) == 10) {
echo "The first character of \$str is a line feed.\n";
}
?>
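As a small illustration of the description above, the sketch below shows that ord() only looks at the first byte of its argument and that chr() reverses it for every byte value from 0 to 255. The variable names are chosen for illustration only.

```php
<?php
// ord() looks only at the first byte; the rest of the string is ignored.
$first = ord("ABC");   // value of "A"

// chr() is the inverse of ord() for every byte value 0..255.
$roundTripOk = true;
for ($byte = 0; $byte <= 255; $byte++) {
    if (ord(chr($byte)) !== $byte) {
        $roundTripOk = false;
    }
}

echo $first, "\n";   // 65
echo $roundTripOk ? "round trip ok\n" : "round trip failed\n";
```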
znaeff at mail dot ru
05-Oct-2011 09:36
Anonymous
05-Apr-2011 05:49
I need to put UTF-8 Hungarian letters into an HTML id attribute, but an id must not contain non-ASCII characters (like á, ő, ű) and must not begin with a number.
<?php
function utfCharToNumber($char) {
$i = 0;
$number = '';
while (isset($char[$i])) {
$number.= ord($char[$i]);
++$i;
}
return $number;
}
// example use
foreach (array('a','A','b','B','c','C','e','é','É', 'ó','Ó','ö','Ö','ő','Ő','ú') as $d) {
echo $d,': ',utfCharToNumber($d),"\n";
}
?>
output:
a: 97
A: 65
b: 98
B: 66
c: 99
C: 67
e: 101
é: 195169
É: 195137
ó: 195179
Ó: 195147
ö: 195182
Ö: 195150
ő: 197145
Ő: 197144
ú: 195186
I generated the following ids:
"char-97", "char-65", "char-98" ...
manixrock(hat)gmail(doink)com
03-Jun-2010 08:14
For anyone having trouble trying to detect the encoding of a string because PHP provides no easy way to see the characters (and byte values) of a string, here's a function that returns the characters and byte values for the ASCII and UTF-8 encodings:
<?php
function hex_chars($data) {
$mb_chars = '';
$mb_hex = '';
for ($i=0; $i<mb_strlen($data, 'UTF-8'); $i++) {
$c = mb_substr($data, $i, 1, 'UTF-8');
$mb_chars .= '{'. ($c). '}';
$o = unpack('N', mb_convert_encoding($c, 'UCS-4BE', 'UTF-8'));
$mb_hex .= '{'. hex_format($o[1]). '}';
}
$chars = '';
$hex = '';
for ($i=0; $i<strlen($data); $i++) {
$c = substr($data, $i, 1);
$chars .= '{'. ($c). '}';
$hex .= '{'. hex_format(ord($c)). '}';
}
return array(
'data' => $data,
'chars' => $chars,
'hex' => $hex,
'mb_chars' => $mb_chars,
'mb_hex' => $mb_hex,
);
}
function hex_format($o) {
$h = strtoupper(dechex($o));
$len = strlen($h);
if ($len % 2 == 1)
$h = "0$h";
return $h;
}
?>
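If only the raw byte values are needed, a shorter route is possible with bin2hex(). The following is a minimal sketch, not a replacement for the function above; byte_dump is a hypothetical name.

```php
<?php
// Hypothetical helper: returns the bytes of $data as space-separated
// uppercase hex pairs, e.g. "41 42".
function byte_dump($data) {
    if ($data === '') {
        return '';
    }
    // bin2hex() yields two hex digits per byte; split them into pairs.
    return implode(' ', str_split(strtoupper(bin2hex($data)), 2));
}

echo byte_dump("A\xC3\xA9"), "\n";   // 41 C3 A9 ("A" followed by UTF-8 "é")
```

This works on bytes only, so it says nothing about which characters those bytes represent in a given encoding.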
regalia at umail dot ucsb dot edu
26-Jan-2009 03:54
Make sure that the parameter you are passing to the ord function is a string.
<?php
$num = 12345;
// Incorrect usage of square bracket notation
if(ord($num[0]) == 0) {
echo "Not a valid ASCII character";
}
// Using the substr method will account for any data type
if(ord(substr($num,0,1)) == 0) {
echo "Not a valid ASCII character";
}
?>
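Another option, sketched here under the assumption that the value is a scalar, is to cast it to a string explicitly before calling ord():

```php
<?php
$num = 12345;

// Cast first, then read the first byte: "12345" starts with "1".
$code = ord((string) $num);

echo $code, "\n";   // 49, the ASCII value of "1"
```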
darien at etelos dot com
19-Jan-2007 11:27
I found I wanted to sanitize a string for certain ASCII/ANSI characters, but to leave Unicode alone. Since ord() breaks on processing Unicode, I drew these two functions up to help with a sanitizer which looked at ordinal values. (Finding "pack" and "unpack" was much better than my own powers-of-256 code.)
<?php
/*
By Darien Hager, Jan 2007... Use however you wish, but
please give credit in source comments.
Change "UTF-8" to whichever encoding you are expecting to use.
*/
function ords_to_unistr($ords, $encoding = 'UTF-8'){
// Turns an array of ordinal values into a string of unicode characters
$str = '';
for($i = 0; $i < sizeof($ords); $i++){
// Pack this number into a 4-byte string
// (Or multiple one-byte strings, depending on context.)
$v = $ords[$i];
$str .= pack("N",$v);
}
$str = mb_convert_encoding($str,$encoding,"UCS-4BE");
return($str);
}
function unistr_to_ords($str, $encoding = 'UTF-8'){
// Turns a string of unicode characters into an array of ordinal values,
// Even if some of those characters are multibyte.
$str = mb_convert_encoding($str,"UCS-4BE",$encoding);
$ords = array();
// Visit each unicode character
for($i = 0; $i < mb_strlen($str,"UCS-4BE"); $i++){
// Now we have 4 bytes. Find their total
// numeric value.
$s2 = mb_substr($str,$i,1,"UCS-4BE");
$val = unpack("N",$s2);
$ords[] = $val[1];
}
return($ords);
}
?>
S.N.O.W.M.A.N.-X
28-Sep-2006 12:47
I was looking for a way to hash a string with md5 loosely, so that md5("HELLO") and md5("Hello") give the same result. In my case it is for CD titles submitted by users. So I wrote some functions that transform the string into the form I want.
This one is the "call" function, returning the "loose hash".
It keeps only the letters and digits of the string, converts them to uppercase, and then hashes the result with md5.
<?php
function loosehash($string){
return md5(strtoupper(onlyChars($string)));
}
?>
This one walks through the string like a char array and checks the ASCII values; you can edit the values and conditions to fit your needs.
<?php
function onlyChars($string){
$strlength = strlen($string);
$retString = "";
for($i = 0; $i < $strlength; $i++){
if((ord($string[$i]) >= 48 && ord($string[$i]) <= 57) ||
(ord($string[$i]) >= 65 && ord($string[$i]) <= 90) ||
(ord($string[$i]) >= 97 && ord($string[$i]) <= 122)){
$retString .= $string[$i];
}
}
return $retString;
}
?>
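The same idea can be sketched more compactly with preg_replace(). This is an alternative under the same assumptions (only ASCII letters and digits count), and loose_hash is a hypothetical name rather than the author's function.

```php
<?php
// Hypothetical variant: drop everything except ASCII letters and digits,
// upper-case what is left, then hash it.
function loose_hash($title) {
    $kept = preg_replace('/[^0-9A-Za-z]/', '', $title);
    return md5(strtoupper($kept));
}

// Both titles reduce to "HELLOWORLD", so the hashes match.
var_dump(loose_hash('Hello, World!') === loose_hash('HELLO world'));   // bool(true)
```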
phil (at) pchowtos (dot) co (dot) uk
18-Jul-2006 10:51
You can use the following code to generate a random string with a length between $x and $y:
<?php
$x = 1; //minimum length
$y = 10; //maximum length
$len = rand($x,$y); //get a random string length
$string = ''; //start with an empty string
for ($i = 0; $i < $len; $i++) { //loop $len no. of times
$whichChar = rand(1,3); //choose if it's a num, caps or lcase
if ($whichChar == 1) { //it's a number
$string .= chr(rand(48,57)); //randomly generate a num (0-9)
}
elseif ($whichChar == 2) { //it's a capital letter
$string .= chr(rand(65,90)); //randomly generate a ucase (A-Z)
}
else { //it's a small letter
$string .= chr(rand(97,122)); //randomly generate an lcase (a-z)
}
}
echo $string; //echo out the generated string
?>
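A variation on the same idea, wrapped in a function and drawing from a fixed character pool instead of three ASCII ranges. The name random_alnum and the pool are assumptions made for this sketch. Unlike the code above, every character in the pool is equally likely.

```php
<?php
// Hypothetical helper: random string of $min..$max characters taken
// from a fixed pool of digits and ASCII letters.
function random_alnum($min, $max) {
    $pool = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
    $last = strlen($pool) - 1;
    $len  = mt_rand($min, $max);
    $out  = '';
    for ($i = 0; $i < $len; $i++) {
        $out .= $pool[mt_rand(0, $last)];
    }
    return $out;
}

echo random_alnum(1, 10), "\n";
```

mt_rand() is not suitable for passwords or tokens; on PHP 7 and later, random_int() takes the same two arguments and is the safer choice there.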
erdem at a1tradenetwork dot com
16-May-2006 11:16
Here is a script that prints a table of the characters from 32 to 255.
<?php
$color = "#f1f1f1";
echo "<center>";
echo "<h1>From 32 To 255 Characters Table</h1>";
echo "</center>";
echo "<table border=\"0\" style=\"font-family:verdana;font-size:11px;\"".
" align=\"center\" width=\"800\"><tr style=\"font-weight:bold;\" ".
"bgcolor=\"#99cccc\">".
"<td width=\"15\">DEC</td><td width=\"15\">OCT</td>".
"<td width=\"15\">HEX</td><td width=\"15\">CHR</td>".
"<td width=\"15\">DEC</td><td width=\"15\">OCT</td>".
"<td width=\"15\">HEX</td><td width=\"15\">CHR</td>".
"<td width=\"15\">DEC</td><td width=\"15\">OCT</td>".
"<td width=\"15\">HEX</td><td width=\"15\">CHR</td>".
"<td width=\"15\">DEC</td><td width=\"15\">OCT</td>".
"<td width=\"15\">HEX</td><td width=\"15\">CHR</td>".
"<td width=\"15\">DEC</td><td width=\"15\">OCT</td>".
"<td width=\"15\">HEX</td><td width=\"15\">CHR</td>".
"<td width=\"15\">DEC</td><td width=\"15\">OCT</td>".
"<td width=\"15\">HEX</td><td width=\"15\">CHR</td>".
"</tr><tr>";
$ii = 0;
for ($i=32;$i<=255;$i++){
$char = chr($i);
$dec = ord($char);
if ($i == "32") {
$char = "Space";
}
echo "<td style=\"background-color:$color;width:15px;\">".
$dec."</td>\n<td style=\"background-color:$color;".
"width:15px;text-align:left;\">".decoct($dec)."</td>\n".
"<td style=\"background-color:$color;width:15px;".
"text-align:left;\">".dechex($dec)."</td>\n ".
"<td style=\"background-color:$color;width:15px;".
"text-align:left;\"><b>".$char."</b></td>\n ";
if (($ii % 6) == 5) {
echo "</tr>\n<tr>\n";
}
if (($ii % 2) == 1) {
$color = "#f1f1f1";
}else {
$color = "#ffffcc";
}
$ii++;
}
echo "</tr></table>";
?>
Matthew Flint
31-Oct-2005 02:59
I wrote the following function to clean illegal characters from input strings.
(Background: I have a php-based news website. People were writing articles in MS Word, then copy-and-pasting the text into the website. Word uses non-standard characters for opening and closing quotes and double-quotes, and for "..." - and this was resulting in articles on the website that failed XHTML validation)
<?php
function clean_string_input($input)
{
$interim = strip_tags($input);
if(get_magic_quotes_gpc())
{
$interim=stripslashes($interim);
}
// now check for pure ASCII input
// special characters that might appear here:
// 96: opening single quote (not strictly illegal, but substitute anyway)
// 145: opening single quote
// 146: closing single quote
// 147: opening double quote
// 148: closing double quote
// 133: ellipsis (...)
// 163: pound sign (this is safe, so no substitution required)
// these can be substituted for safe equivalents
$result = '';
for ($i=0; $i<strlen($interim); $i++)
{
$char = $interim[$i];
$asciivalue = ord($char);
if ($asciivalue == 96)
{
$result .= '\'';
}
else if (($asciivalue > 31 && $asciivalue < 127) ||
($asciivalue == 163) || // pound sign
($asciivalue == 10) || // lf
($asciivalue == 13)) // cr
{
// it's already safe ASCII
$result .= $char;
}
else if ($asciivalue == 145) // opening single quote
{
$result .= '\'';
}
else if ($asciivalue == 146) // closing single quote
{
$result .= '\'';
}
else if ($asciivalue == 147) // opening double quote
{
$result .= '"';
}
else if ($asciivalue == 148) // closing double quote
{
$result .= '"';
}
else if ($asciivalue == 133) // ellipsis
{
$result .= '...';
}
}
return $result;
}
?>
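The substitution part of this function can also be sketched as a lookup table with strtr(). The sketch assumes the input is in a single-byte encoding such as Windows-1252, as described above; on UTF-8 input these byte values can occur inside multi-byte characters and would be corrupted. The function name is hypothetical.

```php
<?php
// Hypothetical helper: replaces the Windows-1252 "smart" punctuation
// bytes with plain ASCII equivalents. Assumes single-byte input.
function replace_word_punctuation($input) {
    return strtr($input, array(
        "\x91" => "'",     // 145: opening single quote
        "\x92" => "'",     // 146: closing single quote
        "\x93" => '"',     // 147: opening double quote
        "\x94" => '"',     // 148: closing double quote
        "\x85" => '...',   // 133: ellipsis
    ));
}

echo replace_word_punctuation("\x93Hi\x94 \x91x\x92\x85"), "\n";   // "Hi" 'x'...
```

Unlike the loop above, this leaves all other bytes untouched, so any filtering of control characters would still need a separate step.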
28-Feb-2005 06:12
Function using ord() to strip out garbage characters and punctuation from a string. This is handy when trying to be smart about what an input is "trying" to be.
<?php
function cleanstr($string){
$ret = '';
$len = strlen($string);
for($a=0; $a<$len; $a++){
$p = ord($string[$a]);
# keep A-Z (65-90) and a-z (97-122); chr(32) is space, it is preserved..
if(($p > 64 && $p < 91) || ($p > 96 && $p < 123) || $p == 32){
$ret .= $string[$a];
}
}
return $ret;
}
?>
jacobfri at skydebanen dot net
04-Jun-2004 05:10
Just to get things straight about which character table ord() and chr() use.
The range 128-255 is _not_ equivalent to the widely used extended ASCII table, like the one described at www.asciitable.com. For ISO-8859-1 strings, the actual equivalent is the 128-255 range of Unicode; ord() itself only returns the value of a byte.
That's a good thing, because it makes ord() and chr() compatible with JavaScript and any other language that uses Unicode, at least for that range.
Still, it's good to know, and the description of ord() is somewhat misleading when it refers only to www.asciitable.com.
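To make the point concrete, here is a small sketch showing that the result of ord() depends on how the string is encoded. The two byte strings are written out explicitly so the example does not depend on the encoding of the source file.

```php
<?php
$latin1 = "\xE9";       // "é" encoded as ISO-8859-1: one byte
$utf8   = "\xC3\xA9";   // "é" encoded as UTF-8: two bytes

echo ord($latin1), "\n";   // 233, which equals the code point U+00E9
echo ord($utf8), "\n";     // 195, the first UTF-8 byte, not the code point
```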
v0rbiz at yahoo dot com
28-May-2004 04:15
I did not find a Unicode/multibyte-capable 'ord' function, so...
<?php
function uniord($u) {
$k = mb_convert_encoding($u, 'UCS-2LE', 'UTF-8');
$k1 = ord(substr($k, 0, 1));
$k2 = ord(substr($k, 1, 1));
return $k2 * 256 + $k1;
}
?>
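UCS-2 covers only code points up to U+FFFF, so the function above cannot return values for characters outside that range. As a sketch of an alternative that needs no extension, the UTF-8 bytes can be decoded directly with ord(). The name utf8_ord is hypothetical, and the code assumes a non-empty, valid UTF-8 string; malformed input is not detected.

```php
<?php
// Hypothetical helper: code point of the first UTF-8 character in $s.
// Assumes $s is non-empty, valid UTF-8.
function utf8_ord($s) {
    $b0 = ord($s[0]);
    if ($b0 < 0x80) {                      // 0xxxxxxx: 1 byte
        return $b0;
    }
    if ($b0 < 0xE0) {                      // 110xxxxx: 2 bytes
        return (($b0 & 0x1F) << 6) | (ord($s[1]) & 0x3F);
    }
    if ($b0 < 0xF0) {                      // 1110xxxx: 3 bytes
        return (($b0 & 0x0F) << 12)
             | ((ord($s[1]) & 0x3F) << 6)
             |  (ord($s[2]) & 0x3F);
    }
    return (($b0 & 0x07) << 18)            // 11110xxx: 4 bytes
         | ((ord($s[1]) & 0x3F) << 12)
         | ((ord($s[2]) & 0x3F) << 6)
         |  (ord($s[3]) & 0x3F);
}

echo utf8_ord("A"), "\n";                  // 65
echo utf8_ord("\xC3\xA9"), "\n";           // 233    (U+00E9)
echo utf8_ord("\xE2\x82\xAC"), "\n";       // 8364   (U+20AC)
echo utf8_ord("\xF0\x9F\x98\x80"), "\n";   // 128512 (U+1F600)
```

On PHP 7.2 and later with the mbstring extension, mb_ord() does the same job.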
arjini at mac dot com
18-Mar-2004 11:49
If you're looking to provide bare bones protection to email addresses posted to the web try this:
<?php
$string = 'arjini@mac.com';
$finished = '';
for($i=0;$i<strlen($string);++$i){
$n = rand(0,1);
if($n)
$finished.='&#x'.sprintf("%X",ord($string[$i])).';';
else
$finished.='&#'.ord($string[$i]).';';
}
echo $finished;
?>
This randomly encodes each character in the address as either a hex or a decimal HTML entity. Note that a decoding mechanism for this could be written just as easily, so eventually the bots will be able to cut through this like butter, but for now it seems that most harvesters only look for non-hex HTML entities.
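To illustrate how little effort decoding takes, here is a sketch using the built-in html_entity_decode(). The sample string is made up for the example.

```php
<?php
// A mix of hex and decimal numeric entities spelling "a@b.c".
$encoded = '&#x61;&#64;&#x62;&#46;&#99;';

// html_entity_decode() understands both forms.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $decoded, "\n";   // a@b.c
```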
