Monitoring performance of natively compiled stored procedures – database-scoped configuration options |
Tags: Azure SQL Database SQL Server vNext in-memory oltp |
How to configure network for Azure SQL Managed Instance |
Tags: Azure SQL Database SQL Server Managed Instance |
Demo – Azure SQL Database Managed Instance in action |
Tags: Azure SQL Database SQL Server Azure Sql Db Backup CLR Managed Instance Restore Service Broker SQL Agent |
How to configure tempdb in Azure SQL Managed Instance(preview) |
Tags: Azure SQL Database Managed Instance SQL Agent TempDB |
What is Azure SQL Database Managed Instance? |
Tags: Azure SQL Database SQL Server Azure Managed Instance |
February 2018 Leaderboard of Database Systems contributors on MSDN |
Tags: Azure SQL Data Warehouse Azure SQL Database SQL Server Azure Sql Db |
Migrating to Azure SQL Database with Zero Downtime for Read-Only Workloads |
Tags: Azure SQL Database |
New Windows container images of SQL Server 2017 on Windows Server 1709! |
Tags: SQL Server SQL Server 2017 SQL Server vNext |
January 2018 Leaderboard of Database Systems contributors on MSDN |
Tags: Azure SQL Data Warehouse Azure SQL Database SQL Server |
JSON parsing - performance comparison |
One of the first questions people asked when we announced JSON support in SQL Server 2016 was "Will it be slow?" and "How fast can you parse JSON text?". In this post, I will compare the performance of JSON parsing with the JSON_VALUE function against the equivalent XML and string functions.
In this experiment I compared JSON_VALUE with xml.value and a few string functions. I repeated the same function call 100,000 times and calculated the average execution time:
DECLARE @dt1 datetime2, @dt2 datetime2
declare @spentTime float = 0
declare @i int = 0
declare @s nvarchar(100)
while(@i<100000)
begin
DECLARE @json AS NVARCHAR(200) = '{"Root":{"ProductDescription":{"ProductID":"1","ProductName": "Road Bike","Features": {"Warranty": "1 year parts and labor","Maintenance": "3 year parts and labor extended maintenance is available"}}}}'
-- Start: uncomment exactly one of the following calls per test run
set @dt1 = getdate()
set @s = JSON_VALUE(@json, N'$.Root.ProductDescription.ProductName')
-- set @s = CHARINDEX('ProductName', @json)
-- set @s = PATINDEX('%ProductName%', @json)
-- set @s = SUBSTRING(@json, 10, 10)
-- hard-coded simulation of JSON_VALUE using SUBSTRING + CHARINDEX:
-- set @s = SUBSTRING(@json, CHARINDEX('ProductName', @json)+15, CHARINDEX('"', @json, CHARINDEX('ProductName', @json)+15) - (CHARINDEX('ProductName', @json)+15))
set @dt2 = getdate()
set @spentTime += DATEDIFF(nanosecond, @dt1, @dt2)
set @i = @i+1
end;
SELECT 'Execution time is ' + CAST( (@spentTime / 100000 / 1000) as nvarchar(100) ) + ' micro-seconds per call'
I have created one JSON string and applied the JSON_VALUE, CHARINDEX, PATINDEX and SUBSTRING functions in 100,000 iterations. Total time is calculated as the sum of the durations of the individual function calls. In each experiment I uncommented the function I wanted to test and commented out the others.
I have also added one more complex string expression:
SUBSTRING(@json, CHARINDEX('ProductName', @json)+15, CHARINDEX('"', @json, CHARINDEX('ProductName', @json)+15) - (CHARINDEX('ProductName', @json)+15))
This expression is a hard-coded simulation of the JSON_VALUE function. It finds the "ProductName" key, skips the characters that separate the key from its value (the closing quote, colon, space, and opening quote), and takes the rest of the text up to the next double quote. This expression returns the same result as the JSON_VALUE function, although it is not context sensitive and would not work on more complex JSON because it cannot tell whether ProductName is actually a key or just some random text. However, it is good enough for a performance comparison.
To test xml.value I will use a slightly modified input that uses XML instead of JSON:
DECLARE @dt1 datetime, @dt2 datetime
declare @spentTime float = 0
declare @i int = 0
declare @s nvarchar(100)
while(@i<100000)
begin
DECLARE @xml as xml = N'<Root>
<ProductDescription>
<ProductID>1</ProductID>
<ProductName>Road Bike</ProductName>
<Features>
<Warranty>1 year parts and labor</Warranty>
<Maintenance>3 year parts and labor extended maintenance is available</Maintenance>
</Features>
</ProductDescription>
</Root>'
-- Start
set @dt1 = getdate()
set @s = @xml.value('(/Root/ProductDescription/ProductName)[1]', 'nvarchar(400)')
set @dt2 = getdate()
set @spentTime += DATEDIFF(nanosecond, @dt1, @dt2)
set @i = @i+1
end;
SELECT 'Execution time is ' + CAST( (@spentTime / 100000 / 1000) as nvarchar(100) ) + ' micro-seconds per call'
Here I have just converted the JSON into equivalent XML and used an XPath expression instead of a JSON path.
The following table lists the test results:
function            | time (microseconds)
@xml.value          | 30-50
JSON_VALUE          | 11-13
PATINDEX            | 7-9
CHARINDEX           | 7-9
SUBSTRING           | 1-2
SUBSTRING+CHARINDEX | 18-22
An interesting thing to notice is that parsing JSON text is 3-5x faster than parsing the equivalent XML.
JSON_VALUE is about 2x slower than PATINDEX and CHARINDEX, which is expected since these functions have much simpler parsing logic. However, considering that JSON_VALUE uses more complex, context-aware parsing rules (it needs to track the paths of parent objects) and converts escaped characters, this is not too bad.
Direct access by position with SUBSTRING is about 10x faster than JSON_VALUE and 5-6x faster than PATINDEX/CHARINDEX, which is also expected.
The interesting result is the simulation of JSON_VALUE with the SUBSTRING and CHARINDEX functions. Although these functions use simple matches, they are slower than the parser used by JSON_VALUE. The likely reasons are that the JSON text has to be scanned three times to get the character indexes, and that the JSON parser uses exact character matches while the string functions use collation-aware matching rules.
This test was executed on SQL Server 2016 CTP3.2. You can copy-paste the examples and run them on your own instances to check the ratios in your environment.
Tags: JSON SQL2016 |
Returning spatial data in GeoJSON format - Part 2 |
In this post I will show you how to export other spatial types, such as lines or polygons, into equivalent GeoJSON formats.
...(read more)
Tags: JSON SQL2016 |
Clustered Columnstore Index: Data Load Optimizations - Minimal Logging |
When bulk importing a large number of rows, you want to make sure it can be done efficiently, with minimal resource usage, so that you can finish the data load quickly and start running your analytics. SQL Server provides the following three key optimizations during bulk import: minimal logging, locking optimizations, and parallel bulk import.
Just like for regular tables, where data is physically stored as rows, these optimizations are available for tables with columnstore indexes as well, but there are some differences. This series of blog posts explains each of these optimizations in the context of tables with columnstore indexes.
Minimal Logging
Contrary to a common misconception, minimal logging does not mean 'no logging'. Minimal logging logs the page-allocation information, but no data rows are logged. You may wonder why SQL Server could not eliminate logging altogether, since you could just 'reload' the data. The issue is that if the data load fails, SQL Server needs to roll back the transaction and would have no way of knowing which pages were allocated. At this point, hopefully I have convinced you that minimal logging is good for minimizing any logging-related bottleneck. Let us now look at minimal logging for tables with clustered columnstore indexes (CCI).
When you import data into a CCI, the rows are loaded either into a delta rowgroup or into a compressed rowgroup based on the rule "if the batch size is < 102400, the rows are loaded into a delta rowgroup, otherwise they are directly loaded into a compressed rowgroup". Minimal logging is only supported when data is directly loaded into a compressed rowgroup. This means that to get minimal logging with a CCI, you must use a batch size >= 102400.
Here are a couple of examples:
-- create the table
create table t_bulkload (
accountkey int not null,
accountdescription char (500),
accounttype char(500),
AccountCodeAlternatekey int)
-- Let us prepare the data
-- insert 110K rows into a regular table
begin tran
declare @i int = 0
while (@i < 110000)
begin
insert into t_bulkload values (@i, 'description', 'dummy-accounttype', @i*2)
set @i = @i + 1
end
commit
-- bcp out the data... run the following in command window
bcp columnstore..t_bulkload out c:\temp\t_bulkoad-1.dat -c -T
--truncate the table
truncate table t_bulkload
-- create clustered columnstore index
CREATE CLUSTERED COLUMNSTORE index t_bulkload_cci on t_bulkload
-- clean up the transaction log (note, the database is
-- simple recovery mode)
Checkpoint
-- now bulkload the data
begin tran
-- this loads the data in 1 batch
bulk insert t_bulkload
FROM 'c:\temp\t_bulkoad-1.dat'
-- show the log records generated
-- note, I am using 'log record length 1104' because I know the length of the
-- insert log record.
select count(*)
from sys.fn_dblog(NULL, NULL) as t_log
where allocunitname like '%t_bulkload%' and operation like '%LOP_INSERT_ROWS%' and [Log Record Length] = 1104
You will see that it returns 0, as no data rows were logged. Abort the transaction and run the checkpoint command to clean up the log records. Now let us run the same command, but with a batch size of 10000. Recall that this will load the rows into a delta rowgroup.
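For completeness, the cleanup between the two runs could look like this (a minimal sketch; the rollback discards the test load, and because the database is in the simple recovery model the checkpoint clears the log records):
-- undo the minimally logged load from the previous run
rollback tran
-- clear the log records before the next run (simple recovery model)
checkpoint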
begin tran
-- Now use a smaller batch size of 10000
-- I am even using TABLOCK
bulk insert t_bulkload
FROM 'c:\temp\t_bulkoad-1.dat'
WITH
(
BATCHSIZE = 10000
,TABLOCK
)
-- show the log records generated
select count(*)
from sys.fn_dblog(NULL, NULL) as t_log
where allocunitname like '%t_bulkload%' and operation like '%LOP_INSERT_ROWS%' and [Log Record Length] = 1104
You will see that the count is 110000, exactly the number of rows we inserted. In other words, the data rows are fully logged (i.e. there is no minimal logging). So why did we not get minimal logging with the delta rowgroup? There are two reasons. First, the delta rowgroup is organized as a btree, not a heap. Second, we expect a large number of rows to be loaded into a CCI, so we chose to optimize loads that go directly into compressed rowgroups. One additional advantage of loading data directly into a compressed rowgroup is that it eliminates the extra step of loading data into a delta rowgroup and then migrating that data into compressed rowgroups.
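If you want to verify where the rows landed, a query against the rowgroup DMV can help (a sketch; it assumes SQL Server 2016 or later, where sys.dm_db_column_store_row_group_physical_stats is available):
-- show the state (COMPRESSED vs. OPEN delta store) and row count of each rowgroup
select row_group_id, state_desc, total_rows
from sys.dm_db_column_store_row_group_physical_stats
where object_id = object_id('t_bulkload')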
There you have it; our recommendation is to choose a batch size >= 102400 to get the benefits of minimal logging with a clustered columnstore index. In the next blog post, I will discuss parallel bulk import and locking optimizations.
Thanks
Sunil Agarwal
Tags: ColumnStore Index |
Loading GeoJSON data into SQL Server |
GeoJSON is a popular format for representing spatial data. If you receive text formatted as GeoJSON from other systems, you can load it into SQL Server and convert it into spatial types.
The new OPENJSON function in SQL Server 2016 enables you to parse GeoJSON text and load it into SQL Server spatial types.
In this example, I will load GeoJSON text that contains a set of bike-share locations in Washington DC. The GeoJSON sample is provided by Esri and can be found at https://github.com/Esri/geojson-layer-js/blob/master/data/dc-bike-share.json
I will load this GeoJSON text using the following query:
drop table if exists BikeShare
create table BikeShare(
id int identity primary key,
position Geography,
ObjectId int,
Address nvarchar(200),
Bikes int,
Docks int )
declare @bikeShares nvarchar(max) =
'{"type":"FeatureCollection",
"features":[{"type":"Feature",
"id":"56679924",
"geometry":{"type":"Point",
"coordinates":[-77.0592213018017,38.90222845310455]},
"properties":{"OBJECTID":56679924,"ID":72,
"ADDRESS":"Georgetown Harbor / 30th St NW",
"TERMINAL_NUMBER":"31215",
"LATITUDE":38.902221,"LONGITUDE":-77.059219,
"INSTALLED":"YES","LOCKED":"NO",
"INSTALL_DATE":"2010-10-05T13:43:00.000Z",
"REMOVAL_DATE":null,
"TEMPORARY_INSTALL":"NO",
"NUMBER_OF_BIKES":15,
"NUMBER_OF_EMPTY_DOCKS":4,
"X":394863.27537199,"Y":137153.4794371,
"SE_ANNO_CAD_DATA":null}
},
......'
-- NOTE: This GeoJSON is truncated.
-- Copy full example from https://github.com/Esri/geojson-layer-js/blob/master/data/dc-bike-share.json
INSERT INTO BikeShare(position, ObjectId, Address, Bikes, Docks)
SELECT geography::STGeomFromText('POINT ('+long + ' ' + lat + ')', 4326),
ObjectId, Address, Bikes, Docks
from OPENJSON(@bikeShares, '$.features')
WITH (
long varchar(100) '$.geometry.coordinates[0]',
lat varchar(100) '$.geometry.coordinates[1]',
ObjectId int '$.properties.OBJECTID',
Address nvarchar(200) '$.properties.ADDRESS',
Bikes int '$.properties.NUMBER_OF_BIKES',
Docks int '$.properties.NUMBER_OF_EMPTY_DOCKS' )
I have created a table BikeShare that will contain the spatial data, and I have defined a local text variable @bikeShares that contains the GeoJSON taken from https://github.com/Esri/geojson-layer-js/blob/master/data/dc-bike-share.json.
Then I open a rowset over the GeoJSON in the @bikeShares variable using the OPENJSON function. OPENJSON returns one row for each object in the GeoJSON "features" array, with the schema defined in the WITH clause.
Values in the long and lat columns are used to initialize the spatial type using the geography::STGeomFromText method, and the other values are inserted into table columns.
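If you prefer to avoid the WKT string concatenation, the same insert could be written with the geography::Point static method, which takes latitude, longitude, and SRID (a sketch of an equivalent variant; it parses the coordinates directly as float):
INSERT INTO BikeShare(position, ObjectId, Address, Bikes, Docks)
SELECT geography::Point(lat, long, 4326),
ObjectId, Address, Bikes, Docks
from OPENJSON(@bikeShares, '$.features')
WITH (
long float '$.geometry.coordinates[0]',
lat float '$.geometry.coordinates[1]',
ObjectId int '$.properties.OBJECTID',
Address nvarchar(200) '$.properties.ADDRESS',
Bikes int '$.properties.NUMBER_OF_BIKES',
Docks int '$.properties.NUMBER_OF_EMPTY_DOCKS' )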
When I execute this script, I can query spatial data loaded from GeoJSON text:
select position.STAsText(), ObjectId, Address, Bikes, Docks
FROM BikeShare
If you execute this script (with the full GeoJSON sample from the link above), you will see that all objects from the GeoJSON variable are now in the table.
The OPENJSON function enables you to parse any JSON text. You can use it to convert the GeoJSON format into SQL Server spatial types.
Tags: JSON SQL2016 |
Returning spatial data in GeoJSON format - Part 1 |
GeoJSON is a popular format used to represent spatial data. An example of GeoJSON text that represents one point is shown below:
{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [125.6, 10.1]
},
"properties": {
"name": "Dinagat Islands"
}
}
The 'geometry' object holds the type of the spatial data and the coordinates. The "properties" object can hold various custom properties, such as an address line, town, postcode, and other information that describes the object. SQL Server stores spatial information as geometry or geography types, and stores additional properties in standard table columns.
Since GeoJSON is JSON, it can be formatted using the new FOR JSON clause in SQL Server.
In this example, we are going to format the content of the Person.Address table, which has the spatial column SpatialLocation, as GeoJSON using the FOR JSON clause. Person.Address contains a set of points, so in this example I will generate an array of GeoJSON points.
The query looks like this:
select TOP 20
'Feature' AS type,
'Point' as [geometry.type],
JSON_QUERY
( FORMATMESSAGE('[%s,%s]',
FORMAT(SpatialLocation.Long, N'#.##################################################'),
FORMAT(SpatialLocation.Lat, N'#.##################################################'))
) as [geometry.coordinates],
AddressID as [property.id], AddressLine1 as [property.address],
City as [property.city], PostalCode as [property.postcode]
FROM Person.Address
FOR JSON PATH
First I select the 'Feature' literal as 'type', because this is the first key:value pair in the GeoJSON example above. Then I select the 'Point' literal and place it on the geometry.type path in the JSON. These are just fixed values that must exist in the GeoJSON output.
Now comes the tricky part: how do we format latitude and longitude as JSON coordinates? I used the FORMATMESSAGE and FORMAT functions to create a JSON array from the longitude/latitude information in the spatial column. If I just returned this string, FOR JSON would see it as plain text and wrap it with double quotes, e.g.:
"coordinates": "[-122.164644615406, 47.7869921906598]"
However, I don't want the coordinates to be output as a string; I need a JSON array without surrounding double quotes, i.e.:
"coordinates": [-122.164644615406, 47.7869921906598]
Therefore I pass this string to the JSON_QUERY function. JSON_QUERY returns the input string unchanged when there is no selector as the second parameter. However, FOR JSON knows that the output of JSON_QUERY is valid JSON text, so it does not escape the content.
The last part is easy: I just return the columns from the table and format them inside the property object. FOR JSON PATH formats the JSON using the prefixes in the column aliases.
The final output might look like:
[{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [-122.164644615406, 47.7869921906598]
},
"property": {
"id": 1,
"address": "1970 Napa Ct.",
"city": "Bothell",
"postcode": "98011"
}
}, {
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [-122.250185528911, 47.6867097047995]
},
"property": {
"id": 2,
"address": "9833 Mt. Dias Blv.",
"city": "Bothell",
"postcode": "98011"
}
}]
As you can see, with the new FOR JSON clause we can easily produce any JSON-based format, such as GeoJSON.
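To produce a complete GeoJSON FeatureCollection instead of a bare array of features, the per-row query can be nested inside an outer FOR JSON query. Here is a sketch that reuses the Person.Address example above; it relies on WITHOUT_ARRAY_WRAPPER (available since CTP3.2) to return a single object, and on JSON_QUERY to keep the nested array from being escaped:
select 'FeatureCollection' as type,
JSON_QUERY((
select TOP 20
'Feature' AS type,
'Point' as [geometry.type],
JSON_QUERY(FORMATMESSAGE('[%s,%s]',
FORMAT(SpatialLocation.Long, N'#.##################################################'),
FORMAT(SpatialLocation.Lat, N'#.##################################################'))) as [geometry.coordinates],
AddressID as [property.id], AddressLine1 as [property.address]
FROM Person.Address
FOR JSON PATH)) as features
FOR JSON PATH, WITHOUT_ARRAY_WRAPPER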
Tags: JSON SQL2016 |
Indexing data in JSON product catalogs |
In the previous post, Designing Product Catalogs in SQL Server 2016 using JSON, I described how you can design a product catalog with JSON. The product catalog contains a few columns that are common to all types of products, and all custom/specific columns are stored as JSON key:value pairs.
This is a simple, flexible, and generic structure that can hold any product type. The only problem is that a full table scan is used every time we reference JSON_VALUE in a predicate.
In this post I will show how you can add indexes on JSON properties in a product catalog. In SQL Server 2016, you can use two types of indexes on JSON text: standard indexes created on computed columns that expose JSON values, and full-text indexes.
If you know that you will frequently filter rows in the product catalog by some specific path, you can expose the value on that path as a computed column and create a standard NONCLUSTERED index on that column. As an example, imagine that we are filtering rows by the ProductNumber property. We can index that property using the following script:
ALTER TABLE ProductCatalog
ADD vProductNumber AS JSON_VALUE(Data, '$.ProductNumber')
CREATE INDEX jidx_ProductNumber
ON ProductCatalog(vProductNumber)
Note that the computed column is not PERSISTED. It is calculated only when the ProductNumber value in the JSON column changes and the index needs to be updated, so there is no additional table space used in this case. This index will be used if you add a filter predicate that uses the $.ProductNumber property in the JSON_VALUE function.
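For example, a query of the following shape should be able to seek on the jidx_ProductNumber index (a sketch; the product number value is just an illustrative placeholder):
SELECT ProductID, Name, JSON_VALUE(Data, '$.ProductNumber') AS ProductNumber
FROM ProductCatalog
WHERE JSON_VALUE(Data, '$.ProductNumber') = 'BK-R93R-48' -- hypothetical product number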
How fast is this solution? Performance is the same as the performance of an index on a regular column, because the same index is used in both cases.
Note that you can add more than one column to the index: you can combine computed columns based on the JSON_VALUE function with standard columns from the table, put some columns in the INCLUDE section of the index to create covering indexes, use filtered indexes, etc. This way you get all the benefits of standard SQL Server indexes on JSON values.
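For example, a second JSON property could be exposed and combined with an included regular column to build a composite, covering index (a sketch; the vColor column name is an assumption, and Name is a regular column used in the queries later in this post):
-- expose another JSON property as a computed column
ALTER TABLE ProductCatalog
ADD vColor AS JSON_VALUE(Data, '$.Color')
-- composite index over both computed columns, covering the Name column
CREATE INDEX jidx_ProductNumber_Color
ON ProductCatalog(vProductNumber, vColor)
INCLUDE (Name)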
A full-text index breaks JSON text into tokens and enables you to find individual words in the text. You can create a full-text index on the JSON column:
-- Assumption is that FT catalog is already created, e.g.:
-- CREATE FULLTEXT CATALOG jsonFullTextCatalog;
-- Create full text index on SalesReason column.
CREATE FULLTEXT INDEX ON ProductCatalog(Data)
KEY INDEX PK_ProductCatalog_ID
ON jsonFullTextCatalog;
Now you can use the CONTAINS function to check whether the JSON text contains some value. An interesting feature is the NEAR operator, which enables you to check whether two words are near each other:
SELECT ProductID, Name, JSON_VALUE(Data, '$.ProductNumber') AS ProductNumber, ProductModelID, JSON_VALUE(Data, '$.Color') AS Color
FROM ProductCatalog
WHERE CONTAINS(Data, 'NEAR((Color,Black),1)')
This query locates all rows where 'Color' and 'Black' are at distance 1, which is ideal for key:value pairs that are naturally at distance 1.
Note that in some cases this might produce false-positive results (e.g. if some other value is 'Color' and the key placed after that value is 'Black'). If you believe this might happen, you can add the additional predicate JSON_VALUE(Data, '$.Color') = 'Black', which will remove the false-positive rows (see the example below).
FTS will index all key:value pairs in the document, and with the NEAR operator we can create complex and flexible queries:
SELECT ProductID, Name
FROM ProductCatalog
WHERE CONTAINS(Data, 'NEAR((Color,Silver),1) AND NEAR((MakeFlag,true),1) AND NEAR((SafetyStockLevel,100),1)')
This query uses a full-text search and a seek into the table to get the ProductID and Name columns:
As you might notice, you can use any key:value pair in the CONTAINS expression. The FTS index automatically indexes all key:value pairs and enables you to query any of them using the NEAR operator.
The only constraint is that the values should be single words, because if there is longer text in a value, FTS breaks it into separate words and the distance from the key token to the value tokens is different for every token. However, for simple JSON text (like the one imported from the AdventureWorks database) you will be able to index all key values without any problem.
The problem with this approach is that FTS is not aware that there is JSON structure in the text (thanks to xor88 for pointing this out). FTS might return rows that have the words Color and Silver near each other without being aware that Color might not be the JSON key we want. Therefore, we need to add additional filters to remove false-positive results:
SELECT ProductID, Name
FROM ProductCatalog
WHERE CONTAINS(Data, 'NEAR((Color,Silver),1) AND NEAR((MakeFlag,true),1) AND NEAR((SafetyStockLevel,100),1)')
AND JSON_VALUE(Data,'$.Color') = 'Silver'
AND JSON_VALUE(Data,'$.MakeFlag') = 'true'
AND JSON_VALUE(Data,'$.SafetyStockLevel') = '100'
In this query, FTS/CONTAINS is used to find candidate rows that might satisfy the condition. In my example it returns 16 out of 200,000 rows, but some of them might not have the words in the right JSON context. The additional filter with JSON_VALUE then checks that the returned rows actually contain these key:value pairs. To summarize: the full-text match returns candidate rows, and an index seek finds them in ProductCatalog. You can also notice in the plan that the result of the full-text match is pushed down as an index seek predicate. A Filter operator then removes potential false-positive results, and we get the correct results in the output.
This way we can have efficient search by any product field.
Tags: SQL Server 2016 JSON |
Designing Product Catalogs in SQL Server 2016 using JSON |
In this post we will see how we could design flexible product catalogs in SQL Server using JSON.
...(read more)
Tags: SQL Server 2016 JSON |
WITHOUT_ARRAY_WRAPPER - New important change in FOR JSON |
SQL Server 2016 CTP3.2 adds a new option to the FOR JSON clause - WITHOUT_ARRAY_WRAPPER (see https://msdn.microsoft.com/en-us/library/mt631354.aspx).
This option enables you to remove the square brackets [ and ] that surround the JSON text generated by the FOR JSON clause. I will use the following example:
SELECT 2015 as year, 12 as month, 15 as day FOR JSON PATH, WITHOUT_ARRAY_WRAPPER
This query will return:
{ "year":2015, "month":12, "day":15 }
However, without this option, the following text would be returned:
[{ "year":2015, "month":12, "day":15 }]
This might be useful if you know that you are returning a single row in the result set (e.g. if you return a row by primary key, or use TOP 1) and you don't want an array with one element in the output.
It might also be useful if you want to concatenate the result sets of two queries formatted as JSON. In earlier versions you would need to remove the wrappers, concatenate the two outputs, and wrap them with brackets to create a valid JSON array:
set @merged = '[' + dbo_fnUnwrap( (SELECT .... FOR JSON PATH ) ) + ',' + dbo_fnUnwrap( (SELECT .... FOR JSON PATH ) ) + ']'
Now you don't need an additional function that removes the brackets:
set @merged = '[' + (SELECT .... FOR JSON PATH, WITHOUT_ARRAY_WRAPPER ) + ',' + (SELECT .... FOR JSON PATH, WITHOUT_ARRAY_WRAPPER ) + ']'
Why is this important? The FOR JSON functionality has changed compared to the CTP3.1 version. In previous CTPs, FOR JSON PATH in some cases returned a single JSON object instead of a JSON array with one element (for example, when the result set contained a single row); starting with CTP3.2, FOR JSON always returns a JSON array unless WITHOUT_ARRAY_WRAPPER is specified.
Therefore, if you relied on the fact that FOR JSON would sometimes return a single JSON object, beware that this behavior has changed. You need to add WITHOUT_ARRAY_WRAPPER to your script if you want a single object.
Note one important thing - WITHOUT_ARRAY_WRAPPER will not generate valid JSON text. If you try to put FOR JSON WITHOUT_ARRAY_WRAPPER as a column expression, it will be treated as plain text, e.g.:
SELECT col1, col2, col3, (SELECT .... FOR JSON PATH, WITHOUT_ARRAY_WRAPPER ) AS NestedJson
FROM mytab
FOR JSON PATH
FOR JSON PATH will treat NestedJson like any other plain text value: it will escape it and surround it with double quotes.
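If you need the nested result to be embedded as JSON rather than as an escaped string, wrapping the subquery in JSON_QUERY should prevent the escaping, the same trick used in the GeoJSON post above (a sketch; mytab2, col4 and col5 are hypothetical names):
SELECT col1, col2, col3,
-- JSON_QUERY tells FOR JSON that the value is already JSON, so it is not escaped
JSON_QUERY((SELECT col4, col5 FROM mytab2 FOR JSON PATH, WITHOUT_ARRAY_WRAPPER)) AS NestedJson
FROM mytab
FOR JSON PATH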
Tags: SQL Server 2016 JSON |
Columnstore Index: Real-Time Operational Analytics using In-Memory Technology |
Starting with SQL Server 2016, you can implement real-time operational analytics by combining an OLTP workload with a columnstore index. SQL Server offers three possible configurations: (a) disk-based tables with a nonclustered columnstore index, (b) memory-optimized tables with a columnstore index, and (c) a clustered columnstore index with one or more btree indexes. I will blog about these in detail in the coming weeks, but the blog post I just published is a good overview. I look forward to your feedback and questions.
regards
Sunil
Tags: ColumnStore Index |
CHECKSUM and BINARY_CHECKSUM might not return unique values for different inputs |
CHECKSUM and BINARY_CHECKSUM create checksums for text inputs. Checksum values should be different for different input text; however, in some cases the returned values are not unique for different inputs.
You can find patterns of inputs that return the same values. This is similar to the modulo operator, which returns the same remainder if you find the right pattern of numbers.
As an example, if you run the following queries, you will find that these functions return 0 for all strings containing only the letter 'a' that have lengths of 32, 64, 96, ..., n * 32 characters:
select CHECKSUM(REPLICATE('a',32)), CHECKSUM(REPLICATE('a',64)), CHECKSUM(REPLICATE('a',96))
select BINARY_CHECKSUM(REPLICATE('a',32)), BINARY_CHECKSUM(REPLICATE('a',64)), BINARY_CHECKSUM(REPLICATE('a',96))
So if your text contains only the letter 'a' and has a length of 25,984 characters (i.e. 812 * 32), it will fall in the same bucket as the other values in this example and you will get the value 0.
However, if you use text with 33, 65, 97, ..., n * 32 + 1 'a' characters, CHECKSUM will return 142 and BINARY_CHECKSUM will return 97:
select CHECKSUM(REPLICATE('a',33)), CHECKSUM(REPLICATE('a',65)), CHECKSUM(REPLICATE('a',97))
select BINARY_CHECKSUM(REPLICATE('a',33)), BINARY_CHECKSUM(REPLICATE('a',65)), BINARY_CHECKSUM(REPLICATE('a',97))
Conclusion: do not assume that a checksum will always return unique values for different texts. In some rare cases you can get the same value for text values that follow the same pattern.
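If you need a much lower chance of collisions, for example for change detection, a cryptographic hash could be used instead of a checksum (a sketch; HASHBYTES is a built-in function and SHA2_256 is just one of the supported algorithms):
-- SHA-256 produces different hashes for these inputs, unlike CHECKSUM above
select HASHBYTES('SHA2_256', REPLICATE('a',32)), HASHBYTES('SHA2_256', REPLICATE('a',64))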
Tags: General |
Create relational view over JSON text |
SQL Server 2016 enables you to use the JSON_VALUE, JSON_QUERY, and OPENJSON functions to fetch data from JSON text. However, if you want to query JSON data using a standard relational model, you can create views that encapsulate these functions.
The AdventureWorks2016CTP3 database adds a Sales.SalesOrder_json table with two JSON columns: Info, which contains shipping, billing, sales person, and customer information, and OrderItems, which contains the related order line items as a JSON array.
You would need to run the de-normalization script (attached to this post) to create and populate these columns.
If you want to access the information in the Info column, you can create a standard view that encapsulates the JSON_VALUE functions that read values from the JSON column:
CREATE VIEW Sales.vwSalesOrderInfo_json AS
SELECT SalesOrderNumber,
OrderDate, ShipDate, Status, AccountNumber, TotalDue,
JSON_VALUE(Info, '$.ShippingInfo.Province') as [Shipping Province],
JSON_VALUE(Info, '$.ShippingInfo.Method') as [Shipping Method],
JSON_VALUE(Info, '$.ShippingInfo.ShipRate') as ShipRate,
JSON_VALUE(Info, '$.BillingInfo.Address') as [Billing Address],
JSON_VALUE(Info, '$.SalesPerson.Name') as [Sales Person],
JSON_VALUE(Info, '$.Customer.Name') as Customer
FROM Sales.SalesOrder_json
In the relational model you would need to join a lot of tables to get the same information, i.e.:
CREATE VIEW Sales.vwSalesOrderInfoRel_json AS
SELECT SalesOrderNumber, OrderDate, ShipDate, Status, Sales.SalesOrderHeader.AccountNumber, TotalDue,
shipprovince.Name as [Shipping Province],
shipmethod.Name as [Shipping Method],
shipmethod.ShipRate as ShipRate,
billaddr.AddressLine1 + COALESCE ( ', ' + billaddr.AddressLine2, '') as [Billing Address],
sp.FirstName + ' ' + sp.LastName as [Sales Person],
cust.FirstName + ' ' + cust.LastName as Customer
FROM Sales.SalesOrderHeader
JOIN Person.Address shipaddr
ON Sales.SalesOrderHeader.ShipToAddressID = shipaddr.AddressID
LEFT JOIN Person.StateProvince shipprovince
ON shipaddr.StateProvinceID = shipprovince.StateProvinceID
JOIN Purchasing.ShipMethod shipmethod
ON Sales.SalesOrderHeader.ShipMethodID = shipmethod.ShipMethodID
JOIN Person.Address billaddr
ON Sales.SalesOrderHeader.BillToAddressID = billaddr.AddressID
LEFT JOIN Sales.SalesPerson
ON Sales.SalesPerson.BusinessEntityID = Sales.SalesOrderHeader.SalesPersonID
LEFT JOIN Person.Person AS sp
ON Sales.SalesPerson.BusinessEntityID = sp.BusinessEntityID
LEFT JOIN Sales.Customer
ON Sales.Customer.CustomerID = Sales.SalesOrderHeader.CustomerID
LEFT JOIN Person.Person AS cust
ON Sales.Customer.CustomerID = cust.BusinessEntityID
In the Sales.SalesOrder_json table we also have the OrderItems column, which contains the content of the related SalesOrderDetails table as a JSON array. You can transform this array into a rowset using the following view:
CREATE VIEW Sales.vwSalesOrderItems_json
AS
SELECT SalesOrderID, SalesOrderNumber, OrderDate,
CarrierTrackingNumber, OrderQty, UnitPrice, UnitPriceDiscount, LineTotal, ProductNumber, Name
FROM Sales.SalesOrder_json
CROSS APPLY
OPENJSON (OrderItems)
WITH ( CarrierTrackingNumber NVARCHAR(20),
OrderQty int '$.Item.Qty',
UnitPrice float '$.Item.Price',
UnitPriceDiscount float '$.Item.Discount',
LineTotal float '$.Item.Total',
ProductNumber NVARCHAR(20) '$.Product.Number',
Name NVARCHAR(50) '$.Product.Name'
)
This view returns the same information as the original SalesOrderDetails table.
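Once the views are in place, the JSON data can be queried like any relational data; for example (a sketch that uses only columns defined in the views above):
-- total order value per sales person, read from the JSON Info column through the view
SELECT [Sales Person], COUNT(*) AS OrderCount, SUM(TotalDue) AS TotalDue
FROM Sales.vwSalesOrderInfo_json
GROUP BY [Sales Person]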
Tags: SQL Server 2016 JSON |