Introduction
Microsoft announced Microsoft Fabric at the Build conference, which started on 23 May 2023, and released it in public preview. It has many features to unify your data estate and reshape how your entire team consumes data. Built entirely on Delta Lake, its improved serverless engine will make some waves in the data world.
Here is how to connect your Microsoft Fabric KQL (Kusto Query Language) Database to Azure Event Hubs streaming. You can combine these features to stream your data to a KQL store in Microsoft Fabric. This will not be the only source or destination as we shall see further in this post.
Components you'll need:
- Azure subscription
- Event Hub namespace + Event hub
- Microsoft Fabric account (trial version)
- Streaming dataset (If using a built-in generated dataset, this can be skipped)
For the Event Hubs namespace, you can use the Basic pricing tier, but you might run into its consumer group limitation (only one consumer group is possible). Also, once you connect your Microsoft Fabric KQL database to the $Default consumer group, you will no longer see any information for that consumer group in the portal.
You can still change the pricing tier after creating the Event Hubs namespace.
Real-time Analytics
Step 1 - Getting set up
Use your favorite browser to sign in to https://app.fabric.microsoft.com/
We will primarily be using the Real-Time Analytics tab (the last one), so go ahead and select it to go to the RTA home page.
From here we can easily create a new KQL Database by selecting the KQL Database (Preview) button in the top left and providing a meaningful name. (TEST is meaningful, right?)
After this you should see your database opened at the Real-Time Analytics tab.
If you somehow lose track of where your newly created KQL Database is, simply go back to your workspace tab and apply some filters, for example on the name, on the KQL Database type, or on both.
Step 2 - Start ingesting data
Now is the time to start ingesting data into the KQL database.
We can either select get data -> Event Hubs (see image on right) or...
... create a painting!
Then there's the important step of connecting to our Event Hubs + Event Hub namespace.
Step 3 - Connecting to the cloud
At the Event Hubs data source tab, select Create new cloud connection.
This will open another browser tab that closes automatically. See my previous blog post, “5 tips to get started with Microsoft Fabric”, for how to manage connections without the auto-close happening.
The first part is very easy: you choose a connection name, hopefully more meaningful than mine. After this, you enter the Azure Event Hubs namespace and the event hub within that namespace that you want to connect to.
Step 4 - Authentication
For basic authentication (currently the only option) you will need to enter a shared access policy. For this you use the policy name and one of its key values (primary or secondary). The policy name is the username and the key value is the password.
You could use the default policy on the namespace, but it is advisable to create one for the specific event hub you are connecting to. You also only need the Listen right on this event hub. Hopefully we will get more authentication methods in the future, but this is currently the least-privileged approach.
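If you copied the policy's connection string from the Azure portal instead of the separate values, the mapping to the Fabric dialog can be sketched as below. This is a minimal illustration that assumes the usual `Endpoint=...;SharedAccessKeyName=...;SharedAccessKey=...;EntityPath=...` layout. The function name `fabric_credentials` and the sample values are made up for this example.

```python
def fabric_credentials(connection_string):
    """Split an Event Hubs connection string into the fields the Fabric dialog asks for."""
    parts = {}
    for segment in connection_string.strip().split(";"):
        if not segment:
            continue
        # Split on the first "=" only: base64 keys often end in "=" themselves.
        key, _, value = segment.partition("=")
        parts[key] = value

    # The endpoint looks like sb://<namespace>.servicebus.windows.net/
    namespace = parts["Endpoint"].split("//", 1)[1].split(".")[0]
    return {
        "namespace": namespace,
        # EntityPath is only present on policies created on the event hub itself.
        "event_hub": parts.get("EntityPath"),
        "username": parts["SharedAccessKeyName"],  # policy name
        "password": parts["SharedAccessKey"],      # primary or secondary key
    }


# Hypothetical values, not a real key.
example = (
    "Endpoint=sb://my-namespace.servicebus.windows.net/;"
    "SharedAccessKeyName=fabric-listen;"
    "SharedAccessKey=abc123==;"
    "EntityPath=my-hub"
)
print(fabric_credentials(example))
```

A connection string copied from a namespace-level policy has no `EntityPath`, so `event_hub` comes back as `None` and you fill in the event hub name yourself.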
Step 5 - Preparing a data set
Once the connection is created, make sure you have sample data ready in the event hub for this consumer group before going to the Schema tab. As soon as you open the Schema tab, Fabric will immediately try to use data from the event hub to create a schema.
If you don’t have any data ready, you could try using the preview feature to generate data in the event hub itself.
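Another option is to push a few JSON events yourself. The sketch below is one way to do that, assuming the `azure-eventhub` Python package is installed. The field names (`deviceId`, `temperature`, `timestamp`) and both function names are invented for illustration. Note that sending needs a shared access policy with the Send right, so the Listen-only policy used for Fabric will not work here.

```python
import json
import random
from datetime import datetime, timezone


def make_events(count, seed=None):
    """Build a list of small, made-up sensor readings as plain dicts."""
    rng = random.Random(seed)
    events = []
    for _ in range(count):
        events.append({
            "deviceId": f"sensor-{rng.randint(1, 5)}",
            "temperature": round(rng.uniform(18.0, 30.0), 2),
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
    return events


def send_events(connection_string, event_hub_name, events):
    """Send the events as JSON in a single batch (fine for a handful of events)."""
    # Imported here so make_events() works even without the SDK installed.
    from azure.eventhub import EventData, EventHubProducerClient

    producer = EventHubProducerClient.from_connection_string(
        connection_string, eventhub_name=event_hub_name
    )
    with producer:
        batch = producer.create_batch()
        for event in events:
            # add() raises ValueError once the batch is full; ignored in this sketch.
            batch.add(EventData(json.dumps(event)))
        producer.send_batch(batch)


if __name__ == "__main__":
    for event in make_events(3, seed=42):
        print(json.dumps(event))
    # send_events("<connection string with Send right>", "<event hub name>", make_events(10))
```

Because every event is a flat JSON object with the same keys, the schema detection in the next step has something consistent to work with.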
Step 6 - Finishing up
With this data ready, we can go into schema and see our schema being built for our event hub:
Make sure you select the correct data format here. When I tested this, it defaulted to TXT even though the data was JSON. Go to Summary and hopefully you will see green checkmarks across the board.
After this you are done! You should shortly see your event hub data being populated in your KQL Database. You can easily query the table by selecting the “...” --> “Query Table” --> “Show Any 100 Records” or “Records ingested in the last 24 hours”.
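If you would rather type the queries yourself, the two menu options correspond roughly to the KQL below. This is a sketch: the exact text Fabric generates may differ, the table name `TEST_TABLE` is hypothetical, and the second query assumes the ingestion time policy is enabled on the table.

```python
def sample_query(table, limit=100):
    """KQL that returns an arbitrary set of rows from the table."""
    return f"['{table}']\n| take {limit}"


def recent_query(table, hours=24):
    """KQL that returns rows ingested within the last `hours` hours."""
    return f"['{table}']\n| where ingestion_time() > ago({hours}h)"


# Print the queries so they can be pasted into the KQL query editor.
print(sample_query("TEST_TABLE"))
print(recent_query("TEST_TABLE"))
```

The `['...']` quoting keeps the query valid even when the table name contains spaces or dashes.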
In conclusion
While this set-up works for streaming your event hub data to a KQL database, it gives you little visibility into the source and the destination.
Another issue is that there is no easy way to set up multiple sources or targets. As we will see in my next blog post, there are also more capabilities, for example ingesting the Event Hubs data into a lakehouse. To be able to do all of this and more, keep an eye out for my next blog post: “Using Eventstreams to feed your event hub data to Microsoft Fabric KQL database or Lakehouse”.