Recently, I needed to do some data analysis for a project I was working on. The order references are alphanumeric strings, with a variable length prefix that indicates the origin system. For example, order references AB123456789 and MOB98765 have prefixes AB and MOB.
I have pulled the data into a SQL database table for analysis. I wanted to use the prefix in SQL group by and aggregrate functions. To accomplish this, I created a computed column for the order prefix.
ALTER TABLE datatable ADD order_ref_prefix AS (stuff(order_ref, patindex('%[0-9]%', order_ref), 10000, ''))
This computed column make use of two SQL server functions patindex and stuff. Patindex returns the starting position of the first occurrence of a pattern in a string. In this example, it returns the first occurence of a number in the order reference. Stuff replaces part of a string with another. Here, we replace the substring between the start of the first number in the order reference to position 10000 with an empty string. This effectively truncates the order reference from the first number found.
Once I have the order prefix in its own column, I can use them in queries like these:
select order_ref_prefix, count(*) as occurrence from datatable group by order_ref_prefix order by occurrence desc
select * from database where order_ref_prefix = 'MOB' and <other predicates>