Invalid characters in XML tag name

ThW 2020-01-31 20:33

@Quentin suggests the better way. Using dynamic node names means that you cannot define an XSD/schema, so your XML files can only be checked for well-formedness and you cannot make full use of validators. A <field name="..."/> element is therefore the better solution from a machine-readability and maintenance point of view.
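To illustrate the <field name="..."/> approach, here is a minimal sketch using PHP's DOM extension. The function name fieldsToXml and the root element name "fields" are made up for this example. The point is that the key ends up in an attribute value, so it never has to be a valid XML name.

```php
<?php
/**
 * Hypothetical helper: serializes key/value pairs as <field name="..."> elements,
 * so arbitrary keys never have to be valid XML names.
 */
function fieldsToXml(array $data, string $rootName = 'fields'): string {
    $document = new \DOMDocument('1.0', 'UTF-8');
    $root = $document->createElement($rootName);
    $document->appendChild($root);
    foreach ($data as $key => $value) {
        $field = $document->createElement('field');
        // The key is stored as an attribute value; DOM escapes special characters.
        $field->setAttribute('name', (string)$key);
        // A text node escapes the value on serialization as well.
        $field->appendChild($document->createTextNode((string)$value));
        $root->appendChild($field);
    }
    // Passing the node serializes it without the XML declaration.
    return $document->saveXML($root);
}

echo fieldsToXml(['123 foo' => 'bar', 'a&b' => '1 < 2']);
// <fields><field name="123 foo">bar</field><field name="a&amp;b">1 &lt; 2</field></fields>
```

Because every element is named "field", a schema can describe the structure once, no matter which keys show up in the data.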

However, NCNames (non-colonized names) allow quite a lot of characters. Here is what I implemented in my library for converting JSON.

$nameStartChar defines letters and several Unicode ranges. $nameChar adds some more characters to that definition (like the digits).

The first RegExp removes any character that is NOT a name char. The second removes any leading characters that are NOT defined in $nameStartChar. If the result is empty, the function returns a default name.

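function normalizeString(string $string, string $default = '_'): string {
    // Characters allowed at the start of an NCName
    $nameStartChar =
      'A-Z_a-z'.
      '\\x{C0}-\\x{D6}\\x{D8}-\\x{F6}\\x{F8}-\\x{2FF}\\x{370}-\\x{37D}'.
      '\\x{37F}-\\x{1FFF}\\x{200C}-\\x{200D}\\x{2070}-\\x{218F}'.
      '\\x{2C00}-\\x{2FEF}\\x{3001}-\\x{D7FF}\\x{F900}-\\x{FDCF}'.
      '\\x{FDF0}-\\x{FFFD}\\x{10000}-\\x{EFFFF}';
    // Characters allowed after the first one: adds '.', digits and combining marks
    $nameChar =
      $nameStartChar.
      '\\.\\d\\x{B7}\\x{300}-\\x{36F}\\x{203F}-\\x{2040}';
    $result = \preg_replace(
      [
        // 1. remove every character that is neither a name char nor '-'
        '([^'.$nameChar.'-]+)u',
        // 2. remove leading characters that are not allowed at the start
        '(^[^'.$nameStartChar.']+)u',
      ],
      '',
      $string
    );
    // preg_replace() returns null on failure (e.g. invalid UTF-8); empty() covers that, too
    return empty($result) ? $default : $result;
}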

A qualified XML node name can consist of two NCNames separated by ':'. The first part is the namespace prefix. The function above removes the colon, so it always returns a single NCName without a prefix.
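If you need to keep the prefix, one possible approach is to normalize the two parts separately. This is only a sketch: normalizeQualifiedName is a hypothetical helper that is not part of my library, and it takes the NCName normalizer as a callable so that it does not depend on a specific implementation.

```php
<?php
/**
 * Hypothetical helper: normalizes an optional prefix and the local name separately.
 * $normalize is any callable that turns a string into a valid NCName,
 * for example the normalizeString() function from this answer.
 */
function normalizeQualifiedName(string $name, callable $normalize): string {
    // Split at the first colon only. Any further colons stay in the local
    // part, where $normalize is expected to strip them.
    $parts = \explode(':', $name, 2);
    if (\count($parts) === 1) {
        // No prefix, treat the whole string as a local name.
        return $normalize($parts[0]);
    }
    [$prefix, $localName] = $parts;
    return $normalize($prefix).':'.$normalize($localName);
}
```

With the function above passed in as 'normalizeString', the input 'foo:1bar' should come out as 'foo:bar'. Keep in mind that a syntactically valid prefix is not enough on its own; it still has to be bound to a namespace URI in the document.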

$examples = [
  '123foo', 
  'foo123', 
  '  foo  ', 
  '  ', 
  'foo:bar', 
  'foo-bar'
];

foreach ($examples as $example) {
    var_dump(normalizeString($example));
}

Output:

string(3) "foo"
string(6) "foo123"
string(3) "foo"
string(1) "_"
string(6) "foobar"
string(7) "foo-bar"
