When running a Java application, or any other application that reads an XML file as input, you may see this error: The processing instruction target matching "[xX][mM][lL]" is not allowed.

The usual cause is a malformed XML file. The XML declaration, when present, must be the very first thing in the file, so a common culprit is a blank line (or any other character) accidentally inserted before it:

[ A blank line here will cause the error ]
<?xml version="1.0"?>
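
One way to reproduce this outside of any particular application is to hand both versions of the text to the JDK's built-in XML parser. The following is only a minimal sketch: the class name XmlDeclCheck, its check method, and the <configuration/> root element are made up for illustration, and the exact wording of the message depends on the parser and locale in use.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Nutch or Hadoop.
class XmlDeclCheck {

    /** Returns null if the text is well-formed, otherwise "line:column: message". */
    static String check(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing "[Fatal Error] ..." to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            return e.getLineNumber() + ":" + e.getColumnNumber() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        String good = "<?xml version=\"1.0\"?>\n<configuration/>";
        String bad = "\n" + good; // the same document with a blank line in front

        System.out.println("good: " + check(good)); // prints "good: null"
        System.out.println("bad:  " + check(bad));  // reports a parse error on line 2
    }
}
```

The only difference between the two inputs is the leading newline, which is enough to push the declaration off the start of the document.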

While installing the Nutch search engine, I ran into this error:

ellensmac:~ ellen$ /Users/ellen/Sites/apache-nutch-1.1/bin/nutch inject crawl/crawldb urls
[Fatal Error] nutch-site.xml:7:6: The processing instruction target matching "[xX][mM][lL]" is not allowed.
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: The processing instruction target matching "[xX][mM][lL]" is not allowed.
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1168)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1040)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:980)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:405)
    at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:585)
    at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:290)
    at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:375)
    at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
    at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:138)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:59)
    at org.apache.nutch.crawl.Injector.main(Injector.java:231)
Caused by: org.xml.sax.SAXParseException: The processing instruction target matching "[xX][mM][lL]" is not allowed.

The first line of my file looked fine, and I didn’t immediately see a problem with the XML, so I turned to Firefox. If your XML file looks fine at first glance, a good way to spot errors is to open it in Firefox, which checks that the file is well-formed and reports where the first error occurs.

Firefox showed that I had accidentally inserted extra text (highlighted in pink) at lines 6 and 7. Line 7 is the start of a second XML declaration, which matches the 7:6 (line 7, column 6) position in the error message. An XML declaration may appear only at the very start of the document. Once I removed the extra lines, the command ran without error.
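
If a browser is not handy, a quick text scan can also locate a stray declaration. This is a rough sketch, not a real parser: the class name StrayDeclFinder and its find method are hypothetical, and it would also flag the text "<?xml " inside a comment or CDATA section, where it is actually harmless.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class StrayDeclFinder {

    // "<?xml" in any letter case, followed by whitespace or "?", so that
    // legitimate targets such as "<?xml-stylesheet" are not flagged.
    private static final Pattern DECL =
            Pattern.compile("<\\?[xX][mM][lL](?=[\\s?])");

    /** Returns the 1-based "line:column" of every declaration not at offset 0. */
    static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = DECL.matcher(text);
        while (m.find()) {
            if (m.start() == 0) {
                continue; // the one place a declaration is allowed
            }
            // Count newlines before the match to get its line and column.
            int line = 1;
            int lineStart = 0;
            for (int i = 0; i < m.start(); i++) {
                if (text.charAt(i) == '\n') {
                    line++;
                    lineStart = i + 1;
                }
            }
            hits.add(line + ":" + (m.start() - lineStart + 1));
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "<?xml version=\"1.0\"?>\n"
                + "<configuration>\n"
                + "<?xml version=\"1.0\"?>\n"   // pasted in by mistake
                + "</configuration>";
        System.out.println(find(text)); // prints [3:1]
    }
}
```

Note that the reported column is where the "<?xml" begins, whereas the parser error above points just past it (column 6 for a declaration that starts in column 1).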