Wednesday, October 18, 2017

Oops. When is a #SQLServer maximum not really *the* maximum? (Workspace Memory)

When is a maximum not really the maximum?
When it's a maximum for an explicitly or implicitly modified default.
Whether "the definitive documentation" says so or not.

Yesterday on Twitter #sqlhelp this question came up.


Aha! I thought to myself.  For this I am purposed! To show how Perfmon and DMV data tie out!

So I responded thusly in Part I...


And thusly in Part II...


To wit, I included a clever graphic in the erstwhile conclusion of my rejoinder...

Indeed the internal resource pool defies limitation by the Resource Governor; the sum of resource semaphore target_memory_kb values for that pool is the true maximum workspace memory limit for the SQL Server instance.  But is that number necessarily the same as is reported by "SQLServer:Memory Manager\Maximum Workspace Memory (KB)"?

Based on the two sources I cited, that may look to be the case.  From the link above for the MemoryManager object:

But there are wrinkles.  As it turns out, the number reported in that perfmon counter is the sum of resource semaphore targets for the default pool.  Two factors can cause the default pool resource semaphores' targets to be lower than those of the internal pool: a max_memory_percent below 100 for the default pool itself, and a nonzero min_memory_percent for any other Resource Governor pool.

Consider what happens when default pool max_memory_percent is changed from 100 to 50, with min_memory_percent remaining at 0 for all other pools.  The RP_maxmem_100 pool still reports the same sum of target values as the internal pool.  Both of these pools report a total number twice as large as the default pool reports.  And it's the default pool's sum of resource semaphore targets that gets reported by "SQLServer:Memory Manager\Maximum Workspace Memory (KB)".

OK, now let's bring default pool max_memory_percent back to 100.  What happens if a pool other than default has a nonzero min_memory_percent defined?  Pool RP_maxmem_100 has min_memory_percent 27 in the example below. Remember, the internal pool is unfazed by design.  No other pool has a minimum memory set, so the sum of Pool RP_maxmem_100 resource semaphore targets is the same as for the internal pool.  But the default pool now has a semaphore target total and "Query exec memory target" value of 100% - 27% = 73% => 73% * 192890640 = 140861040 (when rounded down to the nearest multiple of 8 KB).
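The arithmetic above can be sketched outside SQL Server.  Here's a minimal Python illustration, assuming the default pool target is simply the remaining percentage of maximum workspace memory aligned down to an 8 KB multiple; the exact internal rounding and base values SQL Server uses aren't documented, so results may differ slightly from any particular observed counter value.

```python
# Sketch of the default pool "Query exec memory target" arithmetic described
# above. The 8 KB alignment and percentages mirror the example; the precise
# rounding SQL Server applies internally is an assumption here.

def default_pool_target_kb(max_workspace_kb, other_pools_min_pct):
    """Reduce the instance-wide workspace maximum by the min_memory_percent
    reserved for other pools, then align down to a multiple of 8 KB."""
    pct = 100 - sum(other_pools_min_pct)
    target = max_workspace_kb * pct // 100
    return target - (target % 8)   # align to an 8 KB multiple

# With numbers patterned on the example: 27% reserved by RP_maxmem_100
print(default_pool_target_kb(192890640, [27]))
```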

As Paul Harvey would say:
Now you know... the rest of the story.

Ciao for now!

perfmon "SQLServer:Resource Pool Stats(*)\Query exec memory target (KB)" ties out well with sys.dm_exec_query_resource_semaphores

SELECT opc.[object_name], opc.instance_name, 
       opc.cntr_value AS [Query exec memory target (KB)], 
       SUM(qrs.target_memory_kb) total__target_memory_kb
FROM sys.dm_os_performance_counters opc
JOIN sys.dm_resource_governor_resource_pools rp ON opc.instance_name = rp.[name]
JOIN sys.dm_exec_query_resource_semaphores qrs ON rp.pool_id = qrs.pool_id
WHERE counter_name = 'Query exec memory target (KB)'
GROUP BY rp.pool_id, opc.[object_name], opc.instance_name, opc.cntr_value
ORDER BY rp.pool_id

And the results...
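Conceptually, the tie-out the query performs is just a grouped sum compared against a counter value.  Here's a toy Python illustration of that idea; the pool names echo the post, but the semaphore targets are invented for the example.

```python
# Toy illustration of the tie-out: sum the per-semaphore targets for each
# pool, then compare each sum to that pool's perfmon counter value.
# All numbers here are invented for the example.
from collections import defaultdict

semaphores = [  # (pool_name, target_memory_kb) - one row per resource semaphore
    ("default",   96445320), ("default",   44445320),
    ("internal", 140890640), ("internal",  52000000),
]
counters = {"default": 140890640, "internal": 192890640}  # perfmon values

sums = defaultdict(int)
for pool, target_kb in semaphores:
    sums[pool] += target_kb

for pool, counter_value in counters.items():
    print(pool, counter_value, sums[pool], counter_value == sums[pool])
```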

Tuesday, October 17, 2017

SQL Server 2016 - memory broker clerk DMV ties out well with perfmon

I invest time in trying to fit pieces of memory together, to understand how various types of accounting for database memory work... and work together.

Sometimes it's tricky and takes me a while :-)

But today for the first time I tried to tie out sys.dm_os_memory_broker_clerks with equivalent counters from sys.dm_os_performance_counters - the counters we get in perfmon.  I was surprised how easily sys.dm_os_memory_broker_clerks could be reconciled with perfmon.

First, a disclaimer.  The columnstore object pool changes pretty quickly once no queries are running.  And sys.dm_os_performance_counters lags behind sys.dm_os_memory_broker_clerks.  It seems like values are pushed from the source for sys.dm_os_memory_broker_clerks to the source for sys.dm_os_performance_counters on a regular interval - maybe every 2 seconds.  This is most noticeable when the broker clerk for columnstore disappears entirely from sys.dm_os_memory_broker_clerks but is still present in sys.dm_os_performance_counters for a brief time.

OK... here are the queries I used to compare...

;WITH broker_clerks AS
      (SELECT instance_name, counter_name, cntr_value 
       FROM sys.dm_os_performance_counters 
       WHERE object_name = 'SQLServer:Memory Broker Clerks')
SELECT instance_name clerk_name, 8 * [Memory broker clerk size] AS [total_kb], 
8 * [Simulation size] AS [simulated_kb],
8 * [Periodic evictions (pages)] AS [periodic_freed_kb], 
8 * [Pressure evictions (pages/sec)] AS [internal_freed_kb]
FROM broker_clerks
PIVOT (MAX (cntr_value) 
       FOR counter_name IN ([Memory broker clerk size], [Simulation size], [Periodic evictions (pages)], [Pressure evictions (pages/sec)])
       ) broker_clerks_pvt
ORDER BY instance_name;

SELECT   clerk_name, total_kb, simulated_kb, periodic_freed_kb, internal_freed_kb
FROM     sys.dm_os_memory_broker_clerks
ORDER BY clerk_name;

And this is what the results looked like...

SQL Server 2016: RING_BUFFER_RESOURCE_MONITOR memory information

Doing some additional memory work.  This is a great post from Jonathan Kehayias (@SQLPoolBoy on Twitter) on an information source that isn't discussed often.

Identifying External Memory Pressure with dm_os_ring_buffers and RING_BUFFER_RESOURCE_MONITOR

I grabbed one of the relevant XML records, to see what other good stuff might be in there.

<Record id="13" type="RING_BUFFER_RESOURCE_MONITOR" time="209467780">
    <Effect type="APPLY_LOWPM" state="EFFECT_OFF" reversed="0">0</Effect>
    <Effect type="APPLY_HIGHPM" state="EFFECT_ON" reversed="1">209397985</Effect>
    <Effect type="REVERT_HIGHPM" state="EFFECT_OFF" reversed="0">0</Effect>
  <MemoryNode id="0">

On my system, I didn't see any values for SharedMemory, AWEMemory, or AvailableExtendedVirtualAddressSpace.  MemoryUtilization and TotalPhysicalMemory seem like items I can forgo for now.
I started with Jonathan's query as a base and, after some bumping around, worked it into the query below, which gathers everything from these XML records other than the five items mentioned above.  It took me a little while to decide how to flatten out the three Effect nodes per record.  And what really took a while to figure out is that each NodeId gets its own ring_buffer_address, so uniquely identifying these XML records takes the Record id plus either ring_buffer_address or NodeId.
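Before the T-SQL, the flattening idea can be shown in a small Python sketch: one output row per (record id, NodeId), with the three Effect elements pivoted into columns.  The sample XML below is adapted from the excerpt above and rearranged into the Record/ResourceMonitor shape the query addresses; treat its exact structure as an assumption.

```python
# Illustrative flattening of a RING_BUFFER_RESOURCE_MONITOR record, mirroring
# what the T-SQL below does with XQuery. Sample XML adapted from the excerpt
# above; element layout is an assumption based on the query's XPath.
import xml.etree.ElementTree as ET

sample = """
<Record id="13" type="RING_BUFFER_RESOURCE_MONITOR" time="209467780">
  <ResourceMonitor>
    <Effect type="APPLY_LOWPM" state="EFFECT_OFF" reversed="0">0</Effect>
    <Effect type="APPLY_HIGHPM" state="EFFECT_ON" reversed="1">209397985</Effect>
    <Effect type="REVERT_HIGHPM" state="EFFECT_OFF" reversed="0">0</Effect>
    <NodeId>0</NodeId>
  </ResourceMonitor>
  <MemoryNode id="0"/>
</Record>"""

def flatten_record(xml_text):
    rec = ET.fromstring(xml_text)
    row = {"record_id": int(rec.get("id")),
           "NodeId": int(rec.findtext("ResourceMonitor/NodeId"))}
    for eff in rec.findall("ResourceMonitor/Effect"):
        name = eff.get("type")                 # e.g. APPLY_LOWPM
        row[name + "__state"] = eff.get("state")
        row[name + "__reversed"] = int(eff.get("reversed"))
        row[name + "__value"] = int(eff.text)
    return row

row = flatten_record(sample)
print(row["record_id"], row["NodeId"], row["APPLY_HIGHPM__value"])
```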

--based on a Jonathan Kehayias query found at... 
;WITH rb_XML AS
(SELECT ring_buffer_address, DATEADD (ss, (-1 * ((cpu_ticks / CONVERT (float, ( cpu_ticks / ms_ticks ))) - [timestamp])/1000), GETDATE()) AS EventTime, 
        CONVERT (xml, record) AS record
 FROM sys.dm_os_ring_buffers
 CROSS JOIN sys.dm_os_sys_info
 WHERE ring_buffer_type = 'RING_BUFFER_RESOURCE_MONITOR'),
rb_record AS
(SELECT rb_XML.EventTime,
        record.value('(/Record/@id)[1]', 'int') AS record_id,
        record.value('(/Record/ResourceMonitor/NodeId)[1]', 'int') AS [NodeId],
        record.value('(/Record/ResourceMonitor/Notification)[1]', 'varchar(max)') AS [Type],
        record.value('(/Record/ResourceMonitor/IndicatorsProcess)[1]', 'int') AS [IndicatorsProcess],
        record.value('(/Record/ResourceMonitor/IndicatorsSystem)[1]', 'int') AS [IndicatorsSystem],
        record.value('(/Record/ResourceMonitor/IndicatorsPool)[1]', 'int') AS [IndicatorsPool],
        record.value('(/Record/MemoryNode/TargetMemory)[1]','bigint') AS [TargetMemory_Kb],
        record.value('(/Record/MemoryNode/ReservedMemory)[1]','bigint') AS [ReservedMemory_Kb],
        record.value('(/Record/MemoryNode/CommittedMemory)[1]','bigint') AS [CommittedMemory_Kb],
        record.value('(/Record/MemoryNode/PagesMemory)[1]','bigint') AS [PagesMemory],
        record.value('(/Record/MemoryRecord/AvailablePhysicalMemory)[1]', 'bigint') AS [Avail_Phys_Mem_Kb],
        record.value('(/Record/MemoryRecord/AvailableVirtualAddressSpace)[1]', 'bigint') AS [Avail_VAS_Kb],
        record.value('(/Record/MemoryRecord/TotalPageFile)[1]','bigint') AS [TotalPageFile_Kb],
        record.value('(/Record/MemoryRecord/AvailablePageFile)[1]','bigint') AS [AvailablePageFile_Kb]
 FROM rb_XML),
rb_effect AS
(SELECT record.value('(/Record/@id)[1]', 'int') AS record_id,
        rb_XML.record.value('(/Record/ResourceMonitor/NodeId)[1]', 'int') AS [NodeId],
        elem.value('(@type)[1]', 'varchar(50)') AS effect_type,
        elem.value('(@state)[1]', 'varchar(50)') AS effect_state, 
        elem.value('(@reversed)[1]', 'int') AS effect_reversed,
        elem.value('.', 'bigint') AS effect_value
 FROM rb_XML
 CROSS APPLY record.nodes('/Record/ResourceMonitor/Effect') effect(elem)),
rb__APPLY_LOWPM(record_id, NodeId, APPLY_LOWPM__state, APPLY_LOWPM__reversed, APPLY_LOWPM__value) AS
(SELECT rb_effect.record_id, NodeId, effect_state, effect_reversed, effect_value 
 FROM rb_effect
 WHERE effect_type = 'APPLY_LOWPM'),
rb__APPLY_HIGHPM(record_id, NodeId, APPLY_HIGHPM__state, APPLY_HIGHPM__reversed, APPLY_HIGHPM__value) AS
(SELECT rb_effect.record_id, NodeId, effect_state, effect_reversed, effect_value 
 FROM rb_effect
 WHERE effect_type = 'APPLY_HIGHPM'),
rb__REVERT_HIGHPM(record_id, NodeId, REVERT_HIGHPM__state, REVERT_HIGHPM__reversed, REVERT_HIGHPM__value) AS
(SELECT rb_effect.record_id, NodeId, effect_state, effect_reversed, effect_value 
 FROM rb_effect
 WHERE effect_type = 'REVERT_HIGHPM')
SELECT rb_record.EventTime, rb_record.record_id, rb_record.NodeId, rb_record.[Type],
       rb_record.IndicatorsProcess, rb_record.IndicatorsSystem, rb_record.IndicatorsPool,
       rb_record.TargetMemory_Kb, rb_record.ReservedMemory_Kb, rb_record.CommittedMemory_Kb,
       rb_record.PagesMemory, rb_record.Avail_Phys_Mem_Kb, rb_record.Avail_VAS_Kb,
       rb_record.TotalPageFile_Kb, rb_record.AvailablePageFile_Kb,
       APPLY_LOWPM__state, APPLY_LOWPM__reversed, APPLY_LOWPM__value,
       APPLY_HIGHPM__state, APPLY_HIGHPM__reversed, APPLY_HIGHPM__value,
       REVERT_HIGHPM__state, REVERT_HIGHPM__reversed, REVERT_HIGHPM__value
FROM rb_record
JOIN rb__APPLY_LOWPM ON rb_record.record_id = rb__APPLY_LOWPM.record_id AND rb_record.NodeId = rb__APPLY_LOWPM.NodeId
JOIN rb__APPLY_HIGHPM ON rb_record.record_id = rb__APPLY_HIGHPM.record_id AND rb_record.NodeId = rb__APPLY_HIGHPM.NodeId
JOIN rb__REVERT_HIGHPM ON rb_record.record_id = rb__REVERT_HIGHPM.record_id AND rb_record.NodeId = rb__REVERT_HIGHPM.NodeId
ORDER BY rb_record.record_id, rb_record.EventTime 

Monday, October 16, 2017

Powershell: Retrieving external per-process memory info within SQL Server

Laerte Junior (@LaerteSQLDBA on Twitter) has a great post on the RedGate Hub.

The PoSh DBA: Solutions using PowerShell and SQL Server

There's a lot of great stuff available in perfmon, and tons of SQL Server-specific stuff available in the DMVs, DMFs, etc.

But what if you've got SQL Server access without remote connect privileges to the Windows server?  Yeah, that happens sometimes. :-)

That's when executing some PowerShell via xp_cmdshell can become extremely handy.

I started with the code example in the blog post above, and expanded a bit in order to grab all of the process memory information I'm interested in.  This ought to help determine if there's something external putting unexpected memory pressure on SQL Server, or if 'max server memory' is simply set too high.  This process-level information won't answer that question alone - it'll have to be considered together with some information about SQL Server memory gathered from the DMVs and DMFs.  But this is all I have time for today :-)  Note: I'm grabbing the 64-bit measures below.  If you happen to be on 32-bit Windows, you'll want to retrieve the 32-bit equivalents from PowerShell (e.g. WorkingSet rather than WorkingSet64).

I wanted to put this stuff in a table...

CREATE TABLE process_memory
    (capture_tm DATETIME,
     ProcessName VARCHAR(256),
     Id INT,
     PrivateMemorySize64 BIGINT,
     NonpagedSystemMemorySize64 BIGINT,
     PagedSystemMemorySize64 BIGINT,
     PagedMemorySize64 BIGINT,
     PeakPagedMemorySize64 BIGINT,
     VirtualMemorySize64 BIGINT,
     PeakVirtualMemorySize64 BIGINT,
     WorkingSet64 BIGINT,
     PeakWorkingSet64 BIGINT
     constraint pk__process_memory
         primary key clustered (capture_tm, ProcessName, Id))
WITH (data_compression = page);

And here's a stored procedure to grab the info from Powershell, shred the XML into an EAV table, then pivot into the table structure I want.

CREATE OR ALTER PROCEDURE get__process_memory
AS
BEGIN
SET NOCOUNT ON;

IF NOT EXISTS (SELECT 1 FROM sys.tables WHERE [name] = 'process_memory')
BEGIN
    PRINT 'Target table process_memory does not exist, early exit.';
    RETURN;
END;
IF NOT EXISTS (SELECT 1 FROM sys.configurations WHERE [name] = 'xp_cmdshell' AND value_in_use = 1)
BEGIN
    PRINT 'External operations with xp_cmdshell not enabled, early exit.';
    RETURN;
END;

DECLARE @powershell_cmd VARCHAR(1000),
        @XML_as_String  VARCHAR(MAX),
        @XML_again      XML;

--create a table variable for the data to go into, preserving the order of insertion
DECLARE @xml_lines TABLE (theOrder INT IDENTITY(1,1), theXML VARCHAR(MAX));

SET @powershell_cmd = '@PowerShell -noprofile -command "Get-Process | Select-Object ProcessName,Id,PrivateMemorySize64,NonpagedSystemMemorySize64,'
                      + 'PagedSystemMemorySize64,PagedMemorySize64,PeakPagedMemorySize64,VirtualMemorySize64,PeakVirtualMemorySize64,WorkingSet64,'
                      + 'PeakWorkingSet64|ConvertTo-XML -As string"';

--insert the XML into the table variable, line by line
INSERT INTO @xml_lines (theXML)
EXECUTE xp_cmdshell @powershell_cmd;

--now assemble the XML as a string in the correct order
SELECT @XML_as_String = COALESCE(@XML_as_String, '') + theXML 
FROM @xml_lines
WHERE theXML IS NOT NULL
ORDER BY theOrder; 
SELECT @XML_again = @XML_as_String;

;WITH eav_cte AS
  (SELECT --shred the XML into an EAV table along with the number of the object in the collection
     [property].value('(./text())[1]', 'varchar(256)') AS [Value],
     [property].value('@Name', 'varchar(32)') AS [Attribute],
     --rank on the unique process Id; the xml type itself can't be used in ORDER BY
     DENSE_RANK() OVER (ORDER BY b.[object].value('(Property[@Name="Id"]/text())[1]', 'int')) AS unique_object
   FROM @XML_again.nodes('Objects/Object') AS b ([object])
   CROSS APPLY b.[object].nodes('./Property') AS c ([property]))
INSERT INTO process_memory
SELECT GETDATE() capture_tm, ProcessName, Id, PrivateMemorySize64, NonpagedSystemMemorySize64, PagedSystemMemorySize64, 
       PagedMemorySize64, PeakPagedMemorySize64, VirtualMemorySize64, PeakVirtualMemorySize64, WorkingSet64, PeakWorkingSet64
FROM eav_cte
PIVOT (MAX([Value]) FOR [Attribute] IN 
       (ProcessName, Id, PrivateMemorySize64, NonpagedSystemMemorySize64, PagedSystemMemorySize64, PagedMemorySize64,
        PeakPagedMemorySize64, VirtualMemorySize64, PeakVirtualMemorySize64, WorkingSet64, PeakWorkingSet64)) AS eav_pivot;
END;

Excellent.  This ought to do nicely.
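For readers outside the T-SQL world, the shred-and-pivot step the procedure performs can be sketched in Python: PowerShell's ConvertTo-XML emits an Objects/Object/Property structure, which flattens naturally into one record per process.  The property names match the procedure; the sample values below are invented for illustration.

```python
# Python sketch of the shred-and-pivot: ConvertTo-XML emits
# Objects/Object/Property elements; flatten each Object into a dict keyed by
# Property Name. Sample values are invented for the illustration.
import xml.etree.ElementTree as ET

sample = """
<Objects>
  <Object Type="System.Diagnostics.Process">
    <Property Name="ProcessName" Type="System.String">sqlservr</Property>
    <Property Name="Id" Type="System.Int32">4112</Property>
    <Property Name="WorkingSet64" Type="System.Int64">108003328</Property>
  </Object>
  <Object Type="System.Diagnostics.Process">
    <Property Name="ProcessName" Type="System.String">powershell</Property>
    <Property Name="Id" Type="System.Int32">6040</Property>
    <Property Name="WorkingSet64" Type="System.Int64">52203520</Property>
  </Object>
</Objects>"""

def pivot_processes(xml_text):
    root = ET.fromstring(xml_text)
    rows = []
    for obj in root.findall("Object"):   # one Object element per process
        rows.append({p.get("Name"): p.text for p in obj.findall("Property")})
    return rows

rows = pivot_processes(sample)
print([(r["ProcessName"], int(r["WorkingSet64"])) for r in rows])
```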

Saturday, September 30, 2017

SQL Server - estimates outside of the histogram - half-baked draft

Gonna pull together lots of stuff about cardinality estimates outside of the stats histogram here.  It'll take me a while...
SQL Server keeps track of how many inserts and deletes have occurred since the last stats update – when the modification count exceeds the stats update threshold, the stats qualify for an update the next time a query requests them.  Trace flag 2371 alters the threshold function before SQL Server 2016.  With the SQL Server 2016 compatibility level, the T2371 function becomes the default behavior.  The auto-stats update and auto-stats update async settings of the database determine what happens once the stats qualify for an update.  But whether it's an auto-stats update or a manual stats update, the density, histogram, etc. are all updated.
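The two threshold functions can be sketched numerically.  The classic rule for larger tables is commonly described as 500 rows plus 20% of the table; the dynamic rule that T2371 enables (and that SQL Server 2016 compatibility level uses by default) is commonly described as roughly SQRT(1000 * rows) once tables are large enough.  Treat the sketch below as an approximation of Microsoft's published guidance; the small-table special cases are omitted.

```python
# Approximation of the stats-update thresholds discussed above:
#   classic:  500 rows + 20% of the table (small-table cases omitted)
#   dynamic:  T2371 / 2016-default behavior, roughly the lower of the classic
#             threshold and SQRT(1000 * rows)
import math

def classic_threshold(rows):
    return 500 + 0.20 * rows

def dynamic_threshold(rows):
    return min(classic_threshold(rows), math.sqrt(1000 * rows))

for rows in (10_000, 100_000, 10_000_000):
    print(rows, classic_threshold(rows), dynamic_threshold(rows))
```

Note how the dynamic rule only diverges from the classic rule for larger tables, where waiting for 20% of a huge table to change would leave stats stale far too long.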

Trace flags 2389, 2390, 4139, and the ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS hint operate outside the full stats framework, bringing in the quickstats update.  They have slightly different scope in terms of which stats qualify for quickstats updates – but in each case it's *only* stats for indexes, not stats for non-indexed columns, that can qualify.  After 3 consecutive stats updates on an index, SQL Server “brands” the stats type as ascending or static; until then it is branded ‘unknown’.  The brand of a stat can be seen by setting trace flag 2388 at the session level and using DBCC SHOW_STATISTICS.

T2389 – Introduced in SQL Server 2005 SP1. quickstats update for stats branded ‘ascending’.  Finds just the current max value, and density together with max value is used for estimate outside of histogram range.  Histogram itself is not persistently amended.
T2390 – Introduced in SQL Server 2005 SP1. quickstats update for stats branded ‘unknown’.  Finds current max value for stats branded ‘unknown’, as above.
T4139 – First available in SQL Server 2012 & 2014, released in the CU stream for each.  A significant gap left by T2389 and T2390 is the case where an index stat has been branded ‘static’ based on its update pattern, even though the long-term trend of its highest value is ascending.  In fact, if more than 10% of the edits to an index between stats updates are *within* the histogram range rather than 90%+ above the histogram range, it may be branded static.  So, with trace flag 4139 an index can qualify for a quickstats update regardless of its brand.  Trace flag 4139 supersedes and exceeds T2389 & T2390 in that manner.  
ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS – This hint was introduced in SQL Server 2016 SP1.  It functions as trace flag 4139 does, but it can only be applied at the query level while trace flag 4139 can be enabled globally.

There are additional details, such as which index types are disqualified from quickstats updates, and when and how the new CE handles ascending keys and/or responds to the trace flags and hint.  But this might already be more than you wanted.

Tuesday, September 26, 2017

#SQLServer Just How Minimal Can That Problem Repro Get? Bit Column + outdated stats + histogram amendment

Often among the hardest of my decisions is whether I should spend more time trying to simplify a given problem by eliminating additional factors, shrinking the data needed for a repro, etc... or just put all that effort into investigation purely aimed at understanding the behavior. I expect that to be a long-term challenge :-)

I was quite happy with the way this particular one worked out, though. It started as a maze... access violations generated on a SQL Server 2016 instance with dozens of databases. The access violation came from an insert query using a synonym - with the underlying table in another database! (I didn't know that was possible - or *ever* desirable - until I saw it in this context.) The AV was occurring during a multi-step data transfer process and it was hard to isolate the data flowing into and out of the two databases in question. But after some finagling, I got the problem repro pretty doggone small. Reproduced the AVs on several CUs of SQL Server 2016 and on SQL Server 2017 RC2.

Let me offer a disclaimer before the code to produce the AV. I don't *think* this will do any persistent harm other than a terminated session and dumps like these below.  Regardless, if you choose to reproduce the access violation, please only do so on a nonproduction system.  The dump the AV created on my SQL Server 2017 system...

Here's the setup.  Two 2-column tables in tempdb.  One of those tables with a single row.  The other table with 1 row - then stats auto-created - then a second row inserted.

-- error reproduced on SQL Server 2016 SP1-CU2, SP1-CU3, SP1-CU4
SELECT @@version;    -- SQL Server 2017 RC2
USE tempdb;

SELECT 1 AS key1, 1 AS key2 
INTO #temp;

SELECT CONVERT(INT, 1) AS key1, CONVERT(BIT, 0) AS deleted 
INTO #temp2;

-- auto create stats while only 1 row
SELECT * FROM #temp2 WHERE deleted = CONVERT(BIT, 0); 

INSERT INTO #temp2 -- add a 2nd row with *different* value for key1
SELECT 2, CONVERT(BIT, 0);

So that's the setup.

After the setup, the offending query is quite simple.

-- New CE (via T2312 or default) plus (T4139 or ENABLE_HIST_AMENDMENT_FOR_ASC_KEYS) produces an Access Violation
SELECT #temp.key2
FROM #temp 
INNER JOIN #temp2 ON #temp.key1 = #temp2.key1
GROUP BY #temp.key2
HAVING MIN(CAST(#temp2.Deleted AS INT)) = 1

And the result...

Ouch.  That looks really painful.

An interesting side note: in the not-too-distant past, histogram amendments via trace flags 2389, 2390 and/or 4139 seemed to be exclusively the realm of the legacy cardinality estimator.  This access violation is a strong indication that is no longer the case.

Filed a Connect item for this.  Fortunately, the workflow and data involved were amenable to a schema change to no longer use a column of type bit, which allowed side-stepping this error for now.

SQL Server 2017 Access Violation - outdated bit column stats and histogram amendments