css2xpath# / Home

!! Deprecated !!

Please note that as of October 30, 2013, this project is deprecated and will likely not be supported in the future. Please see my new project, css2xpath Reloaded, for a replacement.



Introduction

Note: This project was called css2xpath-csharp on Google Code. During the move to BitBucket, I decided to change the name to css2xpath#. The actual C# class that you will use is still css2xpath, however.

The original open-source css2xpath (by Andrea Giammarchi) is a JavaScript function that converts CSS selectors to XPath selectors. This project, css2xpath#, is a C# port of the original.

License

This project is under an MIT license.

Features

  • Client-side transformation of CSS selectors to XPath selectors.
  • Ability to define your own custom rules for transforming CSS to XPath through the public AddRule method.
  • Ability to preload the transformation rules, through the PreloadRules method.
  • Unit test project (UnitTest) included.
  • Self-documenting code.

Basic usage

Be sure to add using MostThingsWeb; to each file that calls the class. That is the namespace that contains it.

Here is an example:

String css = "div#test .note span:first-child";

String xpath = css2xpath.Transform(css);

// 'xpath' will contain:
// //div[@id='test']//*[contains(concat(' ',normalize-space(@class),' '),' note ')]//*[1]/self::span

Advanced usage

Rules

css2xpath# transforms CSS selectors to XPath selectors using rules. I can't take any credit for them, as they were taken verbatim from the original version. With my port, however, you can add your own rules. If there is a CSS feature that isn't implemented yet, you can implement it using the information and instructions below.

Rule structure

A rule has two parts: the Regex that finds a part of the CSS you want to transform, and the replacement string or MatchEvaluator that does the actual transformation. Here is an example of a rule that already exists:

Regex: \s*,\s*

Replacement: |

This rule changes commas to pipe characters (|). In CSS, a comma separates the alternative selectors in a selector list. In XPath, the pipe character does the same job by taking the union of two expressions.

Most rules aren't as simple as this, however. For instance, that last example didn't preserve any part of the match. It just searched for a comma (with any surrounding whitespace) and replaced it. Here is a more complex rule:
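
To see the comma rule in isolation, here is a minimal sketch that applies the same Regex and replacement through Regex.Replace, outside the library. The class name CommaRuleSketch and its Apply method are made up for illustration; css2xpath# itself runs this rule as one entry in its full rule list.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of css2xpath#.
static class CommaRuleSketch
{
    // Same pattern as the rule above: a comma with optional whitespace around it.
    static readonly Regex Comma = new Regex(@"\s*,\s*");

    // Applies only this one rule to the input.
    public static string Apply(string css)
    {
        return Comma.Replace(css, "|");
    }

    static void Main()
    {
        Console.WriteLine(Apply("h1 , h2,h3")); // prints: h1|h2|h3
    }
}
```

Because the whitespace is part of the match, it is swallowed along with the comma, so the pipe ends up directly between the two selectors.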

Regex: \[([^\]~\$\*\^\|\!]+)(=[^\]]+)?\]

Replacement: [@$1$2]

This rule converts CSS-style attribute selectors into their XPath equivalents by putting an @ in front of the attribute name. In the replacement string, $1 refers to the name of the attribute. $2 is the value of the attribute, including the equals sign in front, if present.
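
As a rough illustration of the attribute rule, the sketch below runs the same pattern and replacement string through Regex.Replace on its own. AttributeRuleSketch, Apply, and the sample selectors are assumptions made for this example, not names from the library.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of css2xpath#.
static class AttributeRuleSketch
{
    // Same pattern as the rule above. The excluded characters (~ $ * ^ | !)
    // keep selectors such as [class~=note] out of this rule.
    static readonly Regex Attribute =
        new Regex(@"\[([^\]~\$\*\^\|\!]+)(=[^\]]+)?\]");

    // Applies only this one rule to the input.
    public static string Apply(string css)
    {
        return Attribute.Replace(css, "[@$1$2]");
    }

    static void Main()
    {
        Console.WriteLine(Apply("a[href]"));            // prints: a[@href]
        Console.WriteLine(Apply("input[type='text']")); // prints: input[@type='text']
        Console.WriteLine(Apply("p[class~=note]"));     // prints: p[class~=note]
    }
}
```

The last call shows that operators other than a plain = are left alone, so a later, more specific rule can handle them.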

Adding new rules

To add your rule to the rule collection, use the static AddRule method. For example:

// Replace commas with pipe characters, for separating queries.
css2xpath.AddRule(new Regex(@"\s*,\s*"), "|");


MatchEvaluator-based Replacements

All of the rules in the previous sections have had strings as the replacements. However, css2xpath# has several built-in rules that do not use string replacements. Replacements that involve embedded logic are better implemented by a MatchEvaluator than a complex Regex. Here is an example:

Regex: \[([a-zA-Z0-9_\-]+)\$=([^\]]+)\]

Replacement:

new MatchEvaluator((Match m) => {
    String a = m.Groups[1].Value;
    String b = m.Groups[2].Value;
    return "[substring(@" + a + ",string-length(@" + a + ")-" + (b.Length - 3) + ")=" + b + "]";
})

The replacement above uses an anonymous method to perform the transformation. This specific rule handles $= in attribute selectors. You can see that it plucks out the first and second capturing group (remember: m.Groups[0] references the entire match!) and uses them in the result. One more thing to note is that the only argument to the anonymous method is the match itself.

Adding new rules with MatchEvaluator-based replacements

To add a rule where the replacement is a MatchEvaluator, follow this example, taken from the previous section:

// Handles $= in attribute selectors
css2xpath.AddRule(new Regex(@"\[([a-zA-Z0-9_\-]+)\$=([^\]]+)\]"), new MatchEvaluator((Match m) => {
    String a = m.Groups[1].Value;
    String b = m.Groups[2].Value;
    return "[substring(@" + a + ",string-length(@" + a + ")-" + (b.Length - 3) + ")=" + b + "]";
}));
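
To make the arithmetic in that MatchEvaluator concrete, here is a small sketch that runs the same Regex and evaluator directly through Regex.Replace. EndsWithRuleSketch and Apply are hypothetical names. The sketch also assumes the attribute value reaches this rule already wrapped in quotes, which is what the b.Length - 3 offset suggests.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of css2xpath#.
static class EndsWithRuleSketch
{
    // Same pattern as the $= rule above.
    static readonly Regex EndsWith =
        new Regex(@"\[([a-zA-Z0-9_\-]+)\$=([^\]]+)\]");

    // Applies only this one rule to the input.
    public static string Apply(string css)
    {
        return EndsWith.Replace(css, new MatchEvaluator((Match m) => {
            string a = m.Groups[1].Value; // attribute name, e.g. href
            string b = m.Groups[2].Value; // quoted value, e.g. '.pdf'
            // b.Length counts the two quote characters, so b.Length - 3 is
            // the length of the unquoted value minus one.
            return "[substring(@" + a + ",string-length(@" + a + ")-"
                + (b.Length - 3) + ")=" + b + "]";
        }));
    }

    static void Main()
    {
        // prints: a[substring(@href,string-length(@href)-3)='.pdf']
        Console.WriteLine(Apply("a[href$='.pdf']"));
    }
}
```

For the four-character value .pdf the offset is 3. XPath's substring() counts positions from 1, so starting at string-length(@href)-3 returns the last four characters of the attribute, which are then compared with '.pdf'.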


Preloading rules

css2xpath# uses a static constructor to define the rules, so the first time you call Transform or any other method, the static constructor is invoked automatically. This is fine in most cases, but it means that the first call also pays for compiling the Regex of every rule at runtime. Later calls skip that cost, so their results may come back a little faster.

If you want to control when the rules are compiled, then use the PreloadRules method:

// ... sometime earlier, for example during application initialization:
css2xpath.PreloadRules();

PreloadRules is just an empty method, but calling it will force the static constructor to be executed, thus compiling the rules.
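
The pattern can be sketched on its own, without the library. The class below is a hypothetical stand-in (RuleTable, Preload, and Count are invented names) with a single rule. It only shows how calling an empty static method is enough to trigger a static constructor, and is not the actual css2xpath# source.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for a class that builds its rules in a static constructor.
static class RuleTable
{
    static readonly List<Regex> Rules = new List<Regex>();

    // Runs once, just before the first use of any static member of the class.
    static RuleTable()
    {
        Rules.Add(new Regex(@"\s*,\s*"));
        Console.WriteLine("rules loaded");
    }

    // Deliberately empty: calling it is enough to run the static constructor.
    public static void Preload() { }

    public static int Count { get { return Rules.Count; } }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine("starting");     // prints: starting
        RuleTable.Preload();               // prints: rules loaded
        Console.WriteLine(RuleTable.Count); // prints: 1
    }
}
```

Because the class declares an explicit static constructor, C# runs it at the first access to the class and not earlier, which is why "starting" appears before "rules loaded".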
