Some Basic to advanced Kusto Queries

Gibin Francis
3 min readApr 16, 2021

Here am inclusing some basic and advanced Azure Data Explorer Queries (ADX) or Kusto Queries which i met during my development journey.

Normal Tables

Create table in ADX

.create table tableName(ColumnName:datatype)

Alter table in ADX

.create-or-alter table tableName(ColumnName:datatype)

Create JSON Mapping, This will be used to ingest JSON data to ADX

.create table tableName ingestion json mapping “mappingName”
‘[‘
‘ { “column” : “tableColumnName”, “Properties”:{“Path”:”$.jsonproperty”}},’
‘ { “column” : “tableColumnName”, “Properties”:{“Path”:”$.jsonproperty.innerjsonproperty”}},’
‘ { “column” : “full_text”, “Properties”:{“Path”:”$”}}’
‘]’

Drop Table

.drop table tableName

Clear Table Data

.clear table tableName data

Alter data type of the column

.alter column [‘tableName’].[‘tableColumnName’] type=datatype

Ingest directly into tables, using tags while ingesting will help to remove the data based on tag, we can inset the same without tag also

.ingest inline into table tableName with(tags=’[“tagName”]’) <|
column1_data1,column2_data1
column1_data2,column2_data2

delete ingested data using tag

.drop extents <| .show table tableName extents where tags has “tagName”

ADX use one default ingestion policy, to alter ingestion latency. We can update the conditions based on the need

.alter table tableName policy ingestionbatching 
‘{“MaximumBatchingTimeSpan”: “00:00:10”, “MaximumNumberOfItems”: 10, “MaximumRawDataSizeMB”:100}’

View Ingestion mapping of the database

.show database databaseName policy ingestionbatching

View Ingestion mapping of the table

.show table databaseName.tableName policy ingestionbatching

External Tables

Create External Table. We can use blob/adls as the storage kind also we can change partition based on the need

.create external table externalTableName(ColumnName:datatype)
kind=blob
partition by (DateTimeSource:datetime = bin(DateTimeSource, 1d))
dataformat=csv
(
h@’blobConnectionString'
)

Continuous Export

Create continuously export for a table. This will export the data from a table to another table

.create-or-alter continuous-export continousExportName
over (tableName)
to table externalTableName
with
(intervalBetweenRuns=15m,
forcedLatency=5m)
<| tableName

disable continuous export

.disable continuous-export continousExportName

Queries

One important note on the kusto queries as these conditions will run as chained queries.

Get count

tableName | count;

Take rows from entire list

tableName | take 10

Get the 100 rows after sorting the table with the specified column name

tableName | order by columnName desc | take 100

Prepare timeseries from data. Will count the data between the time range and prepare time chart with one hour bin interval

let min_t = toscalar(tableName| summarize min(columnName));
let max_t = toscalar(tableName| summarize max(columnName));
tableName | make-series num=count() default=0 on timeStampColumn in range(min_t, max_t, 1h) | render timechart

Prepare timeseries with average of column values

let min_t = toscalar(tableName| summarize min(columnName));
let max_t = toscalar(tableName| summarize max(columnName));
tableName | make-series num=avg(todouble(columnName)) default=0 on timeStampColumn in range(min_t, max_t, 1h) | render timechart

Get data between time range with string c, and count it

tableName
| where timeStampColumn between (todatetime(“2020–05–21T14:00:00.000Z”) .. todatetime(“2020–05–22T02:00:00.000Z”))
| count ;

Get data between time raneg with datatime columns

tableName
| where timeStampColumn between (datetime(2020–07–16 14:15:50) .. datetime(2020–07–16 14:15:55))

Group the data based on column with highest timestamp

tableName
| where timeStampColumn > ago(15m)
| summarize arg_max(timeStampColumn, *) by groupByColumnName

Get The maximum valued row

tableName
| where columnName== toscalar(tableName| summarize max(columnName));

Get Current user email who is using the ADX

current_principal_details()[“UserPrincipalName”]

Using “in” operator to check matching data from a list

Using “extend” operator to create additional column with calculated information. Columns added by “extend” operator will only be avaiable with the result set.

Using “prev” operator to find the previous row value of the column.

tableName
| where columnName in (‘value1’, ‘value2’)
| extend NewColumn= DateTimeSource — prev(DateTimeSource),
NewColumn2= prev(DateTimeSource, 2) — prev(DateTimeSource)

Joining Multiple Table Contions and Calculate the count of all

union withsource=’TempTableName’
(tableName1| where condition),(tableName2| where condition)
| count

Calculate time difference

tableName
| project diff=datetime_diff(‘minute’, timeStampColumn1, timeStampColumn2)

Dynamically taking the bin size based on the date time gap

let [‘_startTime’]=datetime(‘2021–02–16T08:51:11Z’);
let [‘_endTime’]=datetime(‘2021–02–17T09:51:11Z’);
tableName
| where timeStampColumn between ([‘_startTime’] .. [‘_endTime’])
| summarize Count = count() by summerizeColumn, bin(timeStampColumn, case( datetime_diff(‘hour’,[‘_endTime’],[‘_startTime’])<= 24, 10m,
datetime_diff(‘hour’,[‘_endTime’],[‘_startTime’]) <= 480, 1h,
1d))
| render timechart

Get data size of your table

tableName 
| summarize Total=sum(estimate_data_size(*))

Get data size as chart for a period of days

tableName 
| where timeStampColumn > ago(14d)
| make-series num=sum(estimate_data_size(*)) default=0 on timeStampColumn in range(ago(14d), ago(1m), 1d)
| render timechart

Get Duplicate count fom the data

tableName
| summarize count() by Columns
| where count_ >1

Will add more in future..

--

--

Gibin Francis

Technical guy interested in MIcrosoft Technologies, IoT, Azure, Docker, UI framework, and more