I’ve been working on a make script that extracts the files it needs to build from a Microsoft Visual Studio project file. The .vcxproj file format is XML, so I imagined it would be easy to use a command-line XML processor to do the job. However, because the project file declares a default namespace, this was not as easy as it first looked.
One tool that can perform XPath queries on XML files is xmllint. This is part of the libxml2 package and is available in most Linux distributions and also on Cygwin. As this script was for use with Windows 7, it seemed like a good choice. The method for running an XPath query using xmllint is the undocumented command line option --xpath:
$ xmllint --xpath /XmlTag1/XmlTag2/etc... XmlFile.xml
This works fine on simple sample XML files such as the standard books.xml. However, running this against a .vcxproj file produces:
$ xmllint --xpath "/Project/ItemGroup/ClCompile" project.vcxproj
XPath set is empty
Worse, the query that I actually want to run crashes outright:
$ xmllint --xpath "/Project/ItemGroup/ClCompile/@Include" project.vcxproj
Segmentation fault (core dumped)
The reason this doesn’t work is down to the project file’s use of the default namespace. The .vcxproj file defines its default namespace as:
http://schemas.microsoft.com/developer/msbuild/2003
This means that all of the XML tags in the .vcxproj file belong to this namespace. But there is no way in an XPath query to specify what the default namespace is: in XPath 1.0, an unprefixed name in a query only matches elements that are in no namespace at all. You need to have mapped the namespace to a prefix within the tool first.
xmllint can do this remapping in its shell mode. But there is no current method to specify the default namespace through a command line option. This means the above XPath query would need to be as follows:
$ xmllint --xpath "/*[namespace-uri()='http://schemas.microsoft.com/developer/msbuild/2003' and local-name()='Project']/*[namespace-uri()='http://schemas.microsoft.com/developer/msbuild/2003' and local-name()='ItemGroup']/*[namespace-uri()='http://schemas.microsoft.com/developer/msbuild/2003' and local-name()='ClCompile']" project.vcxproj
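The repetition in a query like this can be factored out with a small shell helper. The following is only a sketch, assuming a POSIX shell; the names MSBUILD_NS, ns_step and ns_path are made up for illustration and are not part of xmllint.

```shell
#!/bin/sh
# Namespace URI declared by .vcxproj files (as quoted in the text).
MSBUILD_NS='http://schemas.microsoft.com/developer/msbuild/2003'

# ns_step NAME (hypothetical helper): print one XPath step that matches
# element NAME in the MSBuild namespace.
ns_step() {
    printf "*[namespace-uri()='%s' and local-name()='%s']" "$MSBUILD_NS" "$1"
}

# ns_path NAME... (hypothetical helper): join one step per argument into
# an absolute XPath expression.
ns_path() {
    path=''
    for name in "$@"; do
        path="$path/$(ns_step "$name")"
    done
    printf '%s\n' "$path"
}
```

With these defined, the long command should reduce to something like `xmllint --xpath "$(ns_path Project ItemGroup ClCompile)" project.vcxproj`.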
Within the make script itself, I can use variables or macro expansion to make the command a bit more manageable. A hackier alternative would be to strip out the default namespace declaration from the .vcxproj file. This can be done with the sed program:
$ sed -e "s/xmlns/ignore/" project.vcxproj | xmllint --xpath "/Project/ItemGroup/ClCompile/@Include" -
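The substitution above renames the first "xmlns" it finds on every line, whatever it belongs to. A slightly more targeted variant could delete only the exact default namespace declaration. This is a sketch under the assumption that the declaration appears in the file exactly as written here, with double quotes and a leading space; strip_default_ns is a hypothetical name.

```shell
#!/bin/sh
# strip_default_ns (hypothetical helper): copy stdin to stdout with the
# MSBuild default namespace declaration removed. Other attributes, and any
# prefixed xmlns:foo declarations, are left untouched.
strip_default_ns() {
    sed -e 's| xmlns="http://schemas.microsoft.com/developer/msbuild/2003"||'
}
```

Its output could then be piped into xmllint in the same way as before, for example `strip_default_ns < project.vcxproj | xmllint --xpath "/Project/ItemGroup/ClCompile/@Include" -`.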
Another shorter hack would be to relax the matching criteria:
$ xmllint --xpath "//*[local-name()='ClCompile']/@Include" project.vcxproj
But however I implement it, the results need further parsing as they come back in the form of key="value" attributes:
Include="file1.cpp" Include="file2.cpp"
I can use another sed expression to turn that into a simple whitespace-separated list. But I’ll leave that for another post.
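In the meantime, here is a rough sketch of what such an expression might look like. It assumes that the file names contain no spaces or double quotes, which will not hold for every project; attrs_to_list is a hypothetical name.

```shell
#!/bin/sh
# attrs_to_list (hypothetical helper): read Include="..." attributes on
# stdin and print just the values, separated by single spaces.
attrs_to_list() {
    sed -e 's/Include="\([^"]*\)"/\1/g' \
        -e 's/^ *//' \
        -e 's/ *$//' \
        -e 's/  */ /g'
}
```

The first expression unwraps each attribute, the next two trim the ends of the line, and the last one collapses any runs of spaces that remain.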
8 replies on “XPath and the default namespace”
Try xmllint --xpath "string(//*[local-name()='ClCompile']/@Include)" project.vcxproj
Sorry, should be:
setns defns=http://schemas.microsoft.com/developer/msbuild/2003
A possible solution is to use the shell mode of xmllint and a command file that is used as stdin:
xmllint --shell project.vcxproj < commands
Content of commands file:
setrootns
xpath /defaultns:Project/defaultns:ItemGroup/defaultns:ClCompile
You could also define a shorter namespace alias:
setns defns
xpath /defns:Project/defns:ItemGroup/defns:ClCompile
Of course, the output of the command has to be parsed, but that seems simpler than parsing the output of
xmllint --xpath …
Hi Abika, I haven’t done any work parsing vcxproj files or XML files for a long time so I’m probably not the best person to ask. In general to access the n-th child of something you usually use n as an index: e.g. element[n] so I’d start my search there.
Hi,
How do I get the nth child in the case with the namespace? How do I use position in this type of query?
You can extract only the attribute values like so:
xmllint --xpath "//*[local-name()='ClCompile']/@Include/text()"
Thanks Emil, I’ll incorporate your correction.
Actually, to get any result you need local-name()='ElementName' instead of name()='ElementName'