How the EXTRACTVALUE() function works in Mariadb?

The EXTRACTVALUE() function is an XML function that extracts a value from an XML document using an XPath expression.

Posted on

The EXTRACTVALUE() function is an XML function that extracts a value from an XML document using an XPath expression. XPath is a language that can navigate and query XML documents. The EXTRACTVALUE() function can be used to parse and extract data from XML documents stored in a table or a variable.

Syntax

The syntax of the EXTRACTVALUE() function is as follows:

EXTRACTVALUE(xml, xpath)

The function takes two arguments: the xml and the xpath. The xml is a string value that contains a valid XML document. The xpath is a string value that contains a valid XPath expression. The function returns a string value that represents the extracted value from the XML document. If the xml or the xpath is invalid, the function returns NULL.

The xpath expression can use any of the XPath 1.0 syntax and functions, except for those that return a node set. The function only returns the text value of the first node that matches the xpath expression. If the xpath expression matches more than one node, the function returns a warning. If the xpath expression does not match any node, the function returns an empty string.

Examples

Example 1: Extract the title from an XML document

In this example, we use the EXTRACTVALUE() function to extract the title from an XML document that contains a book information. We use the @ symbol to indicate an attribute in the xpath expression.

SET @xml = '<book id="1">
  <title>Database Systems</title>
  <author>Thomas Connolly</author>
  <price>59.99</price>
</book>';

SELECT EXTRACTVALUE(@xml, '/book/@id') AS Book_ID,
       EXTRACTVALUE(@xml, '/book/title') AS Book_Title;

The output is:

+----------+------------------+
| Book_ID  | Book_Title       |
+----------+------------------+
| 1        | Database Systems |
+----------+------------------+

The function returns the book id and the book title from the XML document.

Example 2: Extract the names from an XML document

In this example, we use the EXTRACTVALUE() function to extract the names from an XML document that contains a list of students. We use the // symbol to indicate any descendant node in the xpath expression.

SET @xml = '<students>
  <student>
    <name>Alice</name>
    <age>20</age>
    <major>Computer Science</major>
  </student>
  <student>
    <name>Bob</name>
    <age>21</age>
    <major>Mathematics</major>
  </student>
  <student>
    <name>Charlie</name>
    <age>19</age>
    <major>Physics</major>
  </student>
</students>';

SELECT EXTRACTVALUE(@xml, '//name') AS Student_Name;

The output is:

+--------------+
| Student_Name |
+--------------+
| Alice        |
+--------------+
1 row in set, 1 warning (0.00 sec)

The function returns the name of the first student that matches the xpath expression, which is Alice. However, the function also returns a warning, because the xpath expression matches more than one node. The warning message is:

Warning | 1265 | Data truncated for column 'Student_Name' at row 1

This means that the function only returns the first matching node, and ignores the rest.

Example 3: Extract the price from an XML document

In this example, we use the EXTRACTVALUE() function to extract the price from an XML document that contains a product information. We use the * symbol to indicate any child node in the xpath expression.

SET @xml = '<product>
  <name>iPhone 12</name>
  <category>Smartphone</category>
  <price currency="USD">799</price>
</product>';

SELECT EXTRACTVALUE(@xml, '/product/price') AS Price,
       EXTRACTVALUE(@xml, '/product/price/@currency') AS Currency;

The output is:

+-------+----------+
| Price | Currency |
+-------+----------+
| 799   | USD      |
+-------+----------+

The function returns the price and the currency from the XML document.

Example 4: Extract the text from an XML document

In this example, we use the EXTRACTVALUE() function to extract the text from an XML document that contains a comment. We use the text() function to indicate the text node in the xpath expression.

SET @xml = '<comment>
  This is a <b>bold</b> comment.
</comment>';

SELECT EXTRACTVALUE(@xml, '/comment/text()') AS Comment_Text;

The output is:

+-----------------+
| Comment_Text    |
+-----------------+
| This is a       |
+-----------------+

The function returns the text from the XML document, but only the text before the <b> tag. The text after the <b> tag is ignored, because it is not part of the first text node.

Example 5: Extract the value from an invalid XML document

In this example, we use the EXTRACTVALUE() function to extract the value from an invalid XML document that contains a malformed tag. We use a valid xpath expression, but the function returns NULL, because the xml argument is invalid.

SET @xml = '<product>
  <name>iPhone 12</name>
  <category>Smartphone</category>
  <price currency="USD">799</price
</product>';

SELECT EXTRACTVALUE(@xml, '/product/price') AS Price;

The output is:

+-------+
| Price |
+-------+
| NULL  |
+-------+

The function returns NULL, because the xml argument is invalid. The <price> tag is not closed properly.

There are some other XML functions that are related to the EXTRACTVALUE() function. Here are some of them:

  • UPDATEXML(xml, xpath, new_xml): This function updates a part of an XML document using an XPath expression and a new XML value. The function returns a string value that represents the updated XML document. If the xml, the xpath, or the new_xml is invalid, the function returns NULL. For example:

    SET @xml = '<product>
      <name>iPhone 12</name>
      <category>Smartphone</category>
      <price currency="USD">799</price>
    </product>';
    
    SELECT UPDATEXML(@xml, '/product/price', '<price currency="EUR">699</price>') AS Updated_XML;
    

    The output is:

    +----------------------------------------------------+
    | Updated_XML                                        |
    +----------------------------------------------------+
    | <product><name>iPhone 12</name><category>Smartphone|
    | </category><price currency="EUR">699</price></product>|
    +----------------------------------------------------+

    The function updates the price and the currency of the product in the XML document.

  • XMLQUERY(xpath, xml1, xml2, ...): This function evaluates an XPath expression against one or more XML documents and returns a node set. The function returns a string value that represents the node set in XML format. If the xpath or any of the xml arguments is invalid, the function returns NULL. For example:

    SET @xml1 = '<book id="1">
      <title>Database Systems</title>
      <author>Thomas Connolly</author>
      <price>59.99</price>
    </book>';
    
    SET @xml2 = '<book id="2">
      <title>Database Design</title>
      <author>Mark Gillenson</author>
      <price>49.99</price>
    </book>';
    
    SELECT XMLQUERY('/book/title' PASSING BY REF @xml1, @xml2) AS Book_Titles;
    

    The output is:

    +--------------------------------+
    | Book_Titles                    |
    +--------------------------------+
    | <title>Database Systems</title>|
    | <title>Database Design</title> |
    +--------------------------------+

    The function evaluates the xpath expression against the two XML documents and returns the titles of the books.

  • XMLTABLE(xpath, xml, columns): This function converts an XML document into a relational table using an XPath expression and a column definition. The function returns a table that contains the specified columns and the values extracted from the XML document. If the xpath, the xml, or the columns is invalid, the function returns an empty table.

Conclusion

The EXTRACTVALUE() function is a useful XML function that can extract a value from an XML document using an XPath expression. The function takes two arguments: the xml and the xpath. The xml is a string value that contains a valid XML document. The xpath is a string value that contains a valid XPath expression. The function returns a string value that represents the extracted value from the XML document, or NULL if the xml or the xpath is invalid. The function only returns the text value of the first node that matches the xpath expression. If the xpath expression matches more than one node, the function returns a warning. If the xpath expression does not match any node, the function returns an empty string. The function supports various XPath 1.0 syntax and functions, except for those that return a node set. There are some other XML functions that are related to the EXTRACTVALUE() function, such as UPDATEXML(), XMLQUERY(), and XMLTABLE(). These functions can help us manipulate and query XML documents.