Why didn’t SSAS’s data mining tool receive more attention or more buzz in the tech world? There doesn’t seem to be a lot of documentation or tutorials beyond Microsoft’s AdventureWorks models. But it can be a very simple solution for predictions or analytic uses. Here’s my very simple explanation for setting up a sequence cluster model to predict URL paths. There will be 3 parts and today’s post is starting at the very beginning explaining how to model your data. You can have a sequence cluster up and running within an hour.
Correctly modelling your data is the hardest step of the whole process. However, once you understand the requirements and set up your tables correctly, the rest of the process is easy. The sequence cluster model requires two tables. The first is a case table which holds a key to identify each visit and other non-sequenced attributes, such as a customer identifier, visit date and customer type. Here’s an example of what is in this table .
| VisitId | CustomerId | VisitDate | CustomerType |
| 1 | 134123412 | 2-5-19 | P (for Purchaser) |
| 2 | 2352323 | 2-5-19 | F (for Free Registered Users) |
| 3 | 23523452345 | 2-5-19 | N (for non-Registered Users) |
| 4 | 323452345 | 2-5-19 | P |
The nested table holds the sequence data for each visit. Our data looks something like this:
| VisitId | SequenceId | URL |
| 1 | 1 | /level1/level2/page.aspx |
| 1 | 2 | /level1/level2/page2.aspx |
| 1 | 3 | /level1/level2/page3.aspx |
| 2 | 1 | /level1/level2/page4.aspx |
| 2 | 2 | /level2/level2/page2.aspx |
Tomorrow, I will walk you through using SQL Data Tools to set up your model.