Tuesday, June 13, 2023

Extracting Number from String in Alteryx

 I was looking at challenge 120 on Alteryx, where you have to parse a string to extract a number. I struggled with the RegEx tool and ultimately peeked at the solution. This got me thinking that there has to be an easier way which works for all strings and does not involve me pulling my hair out to get the right regex pattern.

The answer was using a Formula tool instead of RegEx, and within formula editor use the REGEX_Replace function:


REGEX_Replace([StringColumn], "[^\d]+", "")


Here's a breakdown of the formula:

  • [StringColumn] is the name of the column or field containing the string you want to extract the number from. Replace it with the actual column name in your workflow.
  • "[^\d]+" is the regular expression pattern that matches any non-digit character ([^\d]) one or more times (+).

The REGEX_Replace function will replace all non-digit characters in the string with an empty string, effectively extracting the number.

To use this formula:

  1. Add a Formula tool to your Alteryx workflow canvas.
  2. Connect the input data to the Formula tool.
  3. Double-click on the Formula tool to open its configuration window.
  4. In the configuration window, select the column or field where the string is located.
  5. In the formula editor, enter the formula REGEX_Replace([StringColumn], "[^\d]+", ""), replacing [StringColumn] with the actual column name.
  6. Configure other options in the Formula tool as per your requirements.
  7. Run the workflow to extract the number using the specified formula.

By using the REGEX_Replace function with the provided formula, you should be able to extract the number from the string in Alteryx without using a separate tool.