<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Raymond Gallagher</title>
    <description>Principal Business Intelligence Engineer at Aspen Tech.</description>
    <link>https://rgdataengineering.silvrback.com/feed</link>
    <atom:link href="https://rgdataengineering.silvrback.com/feed" rel="self" type="application/rss+xml"/>
    <category domain="rgdataengineering.silvrback.com">Content Management/Blog</category>
    <language>en-us</language>
    <pubDate>Fri, 19 Apr 2019 22:40:05 -0400</pubDate>
    <managingEditor>rgdataengineering@outlook.com (Raymond Gallagher)</managingEditor>
      <item>
        <guid>https://rgdataengineering.silvrback.com/database-configuration-management#46959</guid>
        <pubDate>Fri, 19 Apr 2019 22:40:05 -0400</pubDate>
        <link>https://rgdataengineering.silvrback.com/database-configuration-management</link>
        <title>Database Configuration Management</title>
        <description></description>
        <content:encoded><![CDATA[<p>I&#39;ve uploaded a collection of projects and code related to database configuration management for use in managing and reporting on multiple SQL Server instances and databases across multiple environments.</p>

<p><strong>Stargazer</strong></p>

<p>Stargazer is a metadata support solution that also collects information on SQL Agent jobs for reporting.<br>
The job reporting code was written by Ed Pollack at <a href="https://www.Sqlshack.com/tracking-job-performance-Sql-server">https://www.Sqlshack.com/tracking-job-performance-Sql-server</a>.<br>
T-SQL and Microsoft SQL Server are used.<br>
The purpose of Stargazer is to store metadata (for example, server names) once and use it for many solutions.</p>

<p>Fully qualified domain names are supported for managing multiple environments.</p>
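
<p>As a rough illustration, the kind of centralized metadata table Stargazer is built around might look like the T-SQL sketch below.  The table and column names here are my own illustrative assumptions, not the actual Stargazer schema (which is in the download at the end of this post).</p>

<pre><code>-- Illustrative sketch only; the real Stargazer schema ships with the project download.
CREATE TABLE dbo.ServerMetadata (
    ServerID                 INT IDENTITY(1,1) PRIMARY KEY,
    ServerName               NVARCHAR(128) NOT NULL,
    FullyQualifiedDomainName NVARCHAR(256) NOT NULL,
    Environment              NVARCHAR(32)  NOT NULL  -- e.g. Development, QA, Production
);
</code></pre>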

<p><strong>Enterprise</strong></p>

<p>Enterprise is a SQL Server configuration management and warehouse solution.  Much of its code is derived from SqlPowerDoc (<a href="https://Sqlpowerdoc.codeplex.com">https://Sqlpowerdoc.codeplex.com</a>).<br>
The code is written in T-SQL and PowerShell.  Microsoft SQL Server, Analysis Services Tabular and PowerBI are the products used.<br>
The purpose of Enterprise is to provide information for reporting and infrastructure as a service (IaaS) projects.</p>

<p>Enterprise collects data and loads it into Inventory tables within SQL Server.  The following information is collected:</p>

<ul>
<li>Base runs</li>
<li>SQL Server base runs</li>
<li>Computer information</li>
<li>General configuration information for the SQL instance</li>
<li>Database information including backup history</li>
<li>Service information for the SQL services</li>
<li>Linked server information including options and security configuration</li>
</ul>
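
<p>To give a flavor of the collection, backup history can be pulled from the standard msdb catalog.  This query is a hedged sketch of the kind of statement Enterprise runs, not the project’s actual code:</p>

<pre><code>-- Sketch: last full backup per database, from the standard msdb history tables.
SELECT d.name AS DatabaseName,
       MAX(b.backup_finish_date) AS LastFullBackup
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
  ON  b.database_name = d.name
  AND b.type = 'D'            -- 'D' = full database backup
GROUP BY d.name;
</code></pre>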

<p><strong>Enterprise Policy Management</strong></p>

<p>EnterprisePolicyManagement is a copy of the solution posted at <a href="https://epmframework.codeplex.com/">https://epmframework.codeplex.com/</a>.<br>
I have rearranged the jobs and consolidated the policies and conditions, along with writing some of my own.<br>
The code is written in T-SQL and PowerShell.  Microsoft SQL Server is used.<br>
The purpose of EnterprisePolicyManagement is to provide policy evaluation and collection against a SQL Server estate.</p>

<p>Policies and conditions will vary based on business and technical need.</p>

<p><strong>Voyager</strong></p>

<p>Voyager is a SQL Server configuration management solution that evaluates and visualizes policies from Policy Based Management and Desired State Configuration.<br>
The code is written in T-SQL and PowerShell.  Microsoft SQL Server, Analysis Services Tabular and PowerBI are the products used.<br>
The purpose of Voyager is to visualize policy evaluation and provide a way to set a desired state for SQL Server instances.</p>

<p>Desired State Configuration provides a way to define configurations, test them and set them as state.<br>
Reporting is done in PowerBI on PowerBI Report Server.  The data from Policy Based Management and Desired State Configuration is combined.</p>

<p><strong>Sentinel</strong></p>

<p>Sentinel is a reporting and management application built on top of Policy Based Management for SQL Server and the EPM Framework (<a href="https://epmframework.codeplex.com">https://epmframework.codeplex.com</a>).<br>
It was originally written by James Milner and altered by Raymond Gallagher.<br>
T-SQL and Microsoft SQL Server are used, along with reports designed for SQL Server Reporting Services.<br>
The purpose of Sentinel is to provide configuration and policy management for a SQL Server estate.</p>

<p>Sentinel provides an alternative application and methodology for viewing the data available through the EPM Framework.</p>

<p>Please click <a href="https://github.com/raymondgallagher/Database-Configuration-Management">here</a> for the projects and code.</p>
]]></content:encoded>
      </item>
      <item>
        <guid>https://rgdataengineering.silvrback.com/business-intelligence-with-idera-update#38461</guid>
        <pubDate>Sat, 12 May 2018 18:34:51 -0400</pubDate>
        <link>https://rgdataengineering.silvrback.com/business-intelligence-with-idera-update</link>
        <title>Business Intelligence with Idera – Update</title>
        <description></description>
        <content:encoded><![CDATA[<p>In October, I discussed how to set up a business intelligence (BI) solution with Idera data using the Microsoft BI stack.  This includes Integration Services (SSIS), Analysis Services (SSAS) and PowerBI.  Today, I would like to share a minor update to that solution that helped me solve a business problem.</p>

<p>When you have multiple SQL Server instances in Production, you want to know whether it is possible, and feasible, to consolidate databases from multiple instances onto one machine.  This involves a few variables, the most important of which are business function and performance.  For performance, you need to understand the CPU and memory characteristics of your databases and determine whether putting them together will have a detrimental impact.</p>

<p>Solving this problem can be achieved by running the Microsoft Assessment and Planning (MAP) Toolkit.  This is an excellent tool provided by Microsoft that captures information for analysis in migration projects.  I used the tool to capture performance metrics, which were then automatically fed through various statistical programs to produce results (such as the 95th percentile) for analysis.  I highly recommend you check it out if you haven’t already.</p>

<p>In addition to MAP, I saw an opportunity for this project to use the BI solution I discussed earlier.  My goal was to select servers and show an additive figure for their combined CPU, memory and Page Life Expectancy (PLE) values by time.</p>

<p><strong>Tabular Model</strong></p>

<p>First, I had to set up the Tabular Model to display the right measures for the analysis.  In the main fact table, I created or verified measures for the average, maximum and minimum counter values.  These were simple formulas using the DAX functions AVERAGE, MAX and MIN respectively.  I also created a calculated column “UTCDate” that strips the time portion from the “UTCDateTime” column.  I did this to make the model easier to deal with, as I didn’t need to slice down to the time level.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/69a07922-1fa2-4a4c-9a58-aaf90ccabe2a/averagemaxminmeasures.jpg" /></p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/69a07922-1fa2-4a4c-9a58-aaf90ccabe2a/utcdate.jpg" /></p>
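
<p>For reference, the measures and the calculated column can be sketched in DAX as follows.  The table and column names match those used in this post; the measure names are illustrative, so treat this as a sketch rather than the exact formulas in my model.</p>

<pre><code>-- Simple measures on the fact table
CounterAvg := AVERAGE ( IderaStatistics[CounterValue] )
CounterMax := MAX ( IderaStatistics[CounterValue] )
CounterMin := MIN ( IderaStatistics[CounterValue] )

-- Calculated column that strips the time portion from UTCDateTime
UTCDate = DATE ( YEAR ( IderaStatistics[UTCDateTime] ),
                 MONTH ( IderaStatistics[UTCDateTime] ),
                 DAY ( IderaStatistics[UTCDateTime] ) )
</code></pre>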

<p>Next, I created a calculated table that summarized (using the DAX function SUMMARIZE) the values by UTCDate for the measures created in the previous step.  In effect, this is similar to a GROUP BY in SQL Server.  What we end up with is our average, max and min values grouped by date, counter and SQLServerDatabaseID.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/69a07922-1fa2-4a4c-9a58-aaf90ccabe2a/calculatedtable.jpg" /></p>

<p>Here is the formula used to create the calculated table.</p>

<pre><code>=SUMMARIZE(
    IderaStatistics,
    IderaStatistics[SQLServerDatabaseID],
    IderaStatistics[CounterID],
    IderaStatistics[UTCDate],
    "MaxGroupBy", MAX(IderaStatistics[CounterValue]),
    "AvgGroupBy", AVERAGE(IderaStatistics[CounterValue]),
    "MinGroupBy", MIN(IderaStatistics[CounterValue])
)</code></pre>

<p>Lastly, I created two additional measures for each of our three mathematical functions (average, max, min).  First, I created a measure to sum the values.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/69a07922-1fa2-4a4c-9a58-aaf90ccabe2a/avggroupbysum.jpg" /></p>

<p>Then, I created a measure that calculates the value across all selected filters (using the DAX ALLSELECTED function).  This allows me to select the servers in PowerBI and have one value per date representing the summed counter value.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/69a07922-1fa2-4a4c-9a58-aaf90ccabe2a/allselected.jpg" /></p>
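
<p>Sketched in DAX, the pair of measures looks roughly like this.  I’m assuming the calculated table is named IderaStatisticsSummary; the measure names follow the screenshots above.</p>

<pre><code>-- Sum the grouped averages across whatever is in the current filter context
AvgGroupBySum := SUM ( IderaStatisticsSummary[AvgGroupBy] )

-- Same sum, but evaluated over everything selected in the report filters,
-- giving one combined value per date
AvgGroupBySum_AllSelected :=
CALCULATE ( SUM ( IderaStatisticsSummary[AvgGroupBy] ), ALLSELECTED ( ) )
</code></pre>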

<p><strong>Power BI</strong></p>

<p>With the tabular model completed, deployed and active, I can move on to the PowerBI piece of the puzzle.  This is mostly trivial, as the hard work is done in the Tabular Model.</p>

<p>The first report is for CPU.  We are interested in the maximum and average CPU for a server.  The first and second graphs on the top display the average and maximum counter values by date (I have a page-level filter for the CPUActivityPercentage counter from Idera).  These separate the servers according to the filter values, which I have on the left side of the report.  The bottom graph takes those average and maximum values and sums them for all servers in the filter (remember our ALLSELECTED DAX function).  By adding a red line at 100, representing 100% CPU usage, as well as a trend line, I can get a good idea of whether the servers I’ve selected in the filter can be consolidated by CPU.  The real power of this is that the model updates almost instantaneously based on the filter, so I can try out different servers very quickly.</p>

<p>The bottom graph is a line chart that uses the measures MaxGroupBySum_AllSelected and AvgGroupBySum_AllSelected with the UTCDate as an axis.  Below that is a simple time filter visualization for UTCDate.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/69a07922-1fa2-4a4c-9a58-aaf90ccabe2a/powerbi_cpu.jpg" /></p>

<p>In the example above, the average is clearly fine; the maximum, however, is not.</p>

<p>The report for memory usage is very similar.  It is less useful than the CPU report but nonetheless shows how much memory would be needed for a consolidation.  It is the exact same code save for a different page-level filter, this time the SQLMemoryUsed counter from Idera.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/69a07922-1fa2-4a4c-9a58-aaf90ccabe2a/powerbi_memory.jpg" /></p>

<p>Finally, the page life expectancy report shows an average and minimum value (we are not interested in maximum here).  This is a simple report that required none of the extra changes in the Tabular model discussed above.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/69a07922-1fa2-4a4c-9a58-aaf90ccabe2a/powerbi_ple.jpg" /></p>
]]></content:encoded>
      </item>
      <item>
        <guid>https://rgdataengineering.silvrback.com/business-intelligence-with-idera#38460</guid>
        <pubDate>Sat, 12 May 2018 18:27:39 -0400</pubDate>
        <link>https://rgdataengineering.silvrback.com/business-intelligence-with-idera</link>
        <title>Business Intelligence with Idera</title>
        <description></description>
        <content:encoded><![CDATA[<p>This post will talk about how to build a business intelligence solution for Idera SQL Diagnostic Manager data using the Microsoft BI stack of software.  This includes SQL Server, Integration Services, Analysis Services and PowerBI.  As Idera comes with its own suite of software to display dashboards for data, this project as a bit of overlap, however, it’s a good source of data to learn the BI stack and present information in a way that Idera may not be able to.</p>

<p>To get started, we are going to create a data warehouse in SQL Server.  This will consist of some simple tables in a dedicated database modeled after the Kimball design.  For those who are unfamiliar with dimensional modeling I suggest finding some material and studying up on it.  There’s a great deal of information regarding the topic.</p>

<p>We will have one fact table for our measures and three dimension tables for our modeling.  The fact table will consist of counters collected by the Idera instance (which can be thought of as machine data, by the way), along with the counter type ID, the datetime the counter was collected and the IDs for the SQL instance and the database, if applicable.  The counters themselves are the measures: values that will be calculated into averages, maximums and minimums for our analysis.  The four other columns (the datetime, SQL Server ID, database ID and counter ID) will be used by the dimension tables for modeling and slicing.</p>

<p>The three dimension tables are for the counter types, the SQL Server and database instances and the datetime values.  The DimCounter table holds the counter category and name while the DimSQLServer table holds the SQL Server and database names.  The DimDate table is mostly a calculated table.  It stores the datetime corresponding to the measure in the fact table, along with calculated columns representing attributes such as year, month, day and others, derived from the datetime value.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/dimensionalmodel.png" /></p>

<p>Notice that there is a defined primary key for each of the dimension tables and a foreign key from each of the key columns in the fact table.</p>

<p>At the end of this post I’ve included the code for creating the database and the tables.  I also create a staging table and an error table for use later.</p>
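
<p>For orientation, a stripped-down sketch of the schema looks like the following.  The column names and data types are my assumptions from the description above; the full DDL (including the staging and error tables) is in the download at the end of the post.</p>

<pre><code>CREATE TABLE dbo.DimCounter (
    CounterID       INT NOT NULL PRIMARY KEY,
    CounterCategory NVARCHAR(128) NOT NULL,
    CounterName     NVARCHAR(128) NOT NULL
);

CREATE TABLE dbo.DimSQLServer (
    SQLServerID   INT NOT NULL,
    DatabaseID    INT NOT NULL,
    SQLServerName NVARCHAR(128) NOT NULL,
    DatabaseName  NVARCHAR(128) NULL,
    CONSTRAINT PK_DimSQLServer PRIMARY KEY (SQLServerID, DatabaseID)
);

CREATE TABLE dbo.DimDate (
    UTCDateTime DATETIME NOT NULL PRIMARY KEY,
    [Year]      INT NOT NULL,
    [Month]     INT NOT NULL,
    [Day]       INT NOT NULL
);

CREATE TABLE dbo.FactIdera (
    UTCDateTime  DATETIME NOT NULL REFERENCES dbo.DimDate (UTCDateTime),
    SQLServerID  INT NOT NULL,
    DatabaseID   INT NOT NULL,
    CounterID    INT NOT NULL REFERENCES dbo.DimCounter (CounterID),
    CounterValue FLOAT NOT NULL,
    FOREIGN KEY (SQLServerID, DatabaseID)
        REFERENCES dbo.DimSQLServer (SQLServerID, DatabaseID)
);
</code></pre>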

<p><strong>Extract Transform Load (ETL)</strong></p>

<p>The next step is to design the ETL package to use for getting data into the dimensional model.  For this I use SQL Server Integration Services (SSIS), a tool designed for this purpose.  I have connections defined for both the Idera database and the data warehouse itself, and I use one package as part of a master project.  This package loads the dimension tables first and then the fact table.  The date table is seeded from the fact table.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/ssispackage.png" /></p>

<p>The most important thing to understand about SSIS is what it does well vs. what it doesn’t do well.  Generally, you want to perform set operations within the SQL Server database instance.  Although I don’t show it, I perform lookups for the DimCounter and DimSQLServer dimension tables in SSIS, because the data volume is not large and SSIS handles errors and data flow paths very well.  For FactIdera, however, I use the staging table as a fast import target for SSIS to dump data into, and then use the T-SQL MERGE statement to merge the data into the real fact table.  Doing this processing on the database side saves a great deal of time compared with having SSIS do it; depending on how much memory the machine running the package has, it could take minutes to hours to finish if SSIS had to cache the data to load into the fact table.</p>
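
<p>The staging-to-fact step can be sketched with a MERGE like the one below.  The staging table name and column list are assumptions for illustration; the real statement is in the download.</p>

<pre><code>MERGE dbo.FactIdera AS t
USING dbo.StagingIdera AS s
   ON  t.UTCDateTime = s.UTCDateTime
   AND t.SQLServerID = s.SQLServerID
   AND t.DatabaseID  = s.DatabaseID
   AND t.CounterID   = s.CounterID
WHEN NOT MATCHED BY TARGET THEN
    INSERT (UTCDateTime, SQLServerID, DatabaseID, CounterID, CounterValue)
    VALUES (s.UTCDateTime, s.SQLServerID, s.DatabaseID, s.CounterID, s.CounterValue);
</code></pre>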

<p>The DimDate table is seeded from the fact table, as I mentioned earlier.  Because Idera is constantly writing data to its database, importing into the date table before importing into the fact table can result in referential integrity errors if your package runs for a long time.</p>
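
<p>Seeding the date dimension from the incoming fact rows is a small set-based statement.  This sketch assumes the staging table is named dbo.StagingIdera and that DimDate stores year, month and day columns:</p>

<pre><code>INSERT INTO dbo.DimDate (UTCDateTime, [Year], [Month], [Day])
SELECT DISTINCT s.UTCDateTime,
       YEAR(s.UTCDateTime), MONTH(s.UTCDateTime), DAY(s.UTCDateTime)
FROM dbo.StagingIdera AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.DimDate AS d
                  WHERE d.UTCDateTime = s.UTCDateTime);
</code></pre>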

<p>The main data flow part of the package for the FactIdera table is below.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/ssispackage_factidera.png" /></p>

<p>The hardest part here is understanding the data types of the data you are working with.  If SSIS cannot convert between the data types in various tables, it won’t let you perform operations like lookups and joins.  Solving this requires the use of data conversion tasks, as you can see above.  Generally, I find that a Unicode data type (NVARCHAR in SQL Server) is easiest to work with for character data, but it depends on your data.</p>

<p>Once you have the package done and it runs without error (via debugging), you can deploy it to your target SQL Server instance and schedule it using a SQL Server Agent job.</p>

<p>I’ve included the SSIS package below in the download link.</p>

<p><strong>Online Analytical Processing (OLAP)</strong></p>

<p>In the Microsoft world, SQL Server Analysis Services (SSAS) is the tool for the next part of our project: preparing the data for OLAP processing.  OLAP processing, or cube building, as it’s also called, pre-computes many of the common operations we want to perform against the data, including sums, maximums, minimums, averages and counts.  This is part of what makes this method of reporting flexible and fast.  The cube does some of the work up front, so that when it’s presented as a data source, it’s optimized for reporting.</p>

<p>Two different ways of OLAP processing exist.  One is called multidimensional and is the traditional model.  The other is the tabular method and is newer.  Both serve roughly the same purpose and have pros and cons.  For our purposes, we’ll take a look at both.</p>

<p>There are various resources out in the wild that can teach you more about each method and when to use which one.</p>

<p><strong>Multidimensional</strong></p>

<p>The multidimensional model involves creating a data source and view (which is just a view of your tables), as well as dimensions and cubes.  For the dimensions, it’s almost a view of your table, but you have some additional tasks to complete.  One important task is to create hierarchies, especially for your date dimension.  This is what will enable you to drill down in your reports.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/dimdatehierarchy.png" /></p>

<p>Besides hierarchies, attribute relationships can also be created, although I don’t do it here.</p>

<p>Once you’ve loaded your dimensions, you can build and process them, which you should do to see if there are any errors you need to correct.  SSAS allows you to browse your data in the application, but this is mostly for checking to make sure your data is presented as you intended.</p>

<p>Most of your design work will be within your cube.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/ssascube.png" /></p>

<p>The two key aspects to review here are measures and calculations.  There are others, as you can see, but we’ll focus on those two.  Measures are where you define your sums, maximums, minimums and so on.  You should have as many measures as you need for your reports.</p>

<p>Calculations are where you can control the leaf cells of your cube.  For us, we can define an average calculation, as average is not included in the list of measures you can choose from in the main page.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/calculations.png" /></p>

<p>Since an average is just a sum divided by a count, it’s fairly easy to express.  The multidimensional model for SSAS uses a language called MDX, which you can find resources online to learn if you so desire.</p>
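
<p>As a sketch, such a calculated member might look like the MDX below; the underlying sum and count measure names are assumptions for illustration, not the cube’s actual measures.</p>

<pre><code>CREATE MEMBER CURRENTCUBE.[Measures].[Counter Value Avg]
 AS [Measures].[Counter Value Sum] / [Measures].[Counter Value Count],
FORMAT_STRING = "#,##0.00",
VISIBLE = 1;
</code></pre>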

<p>Once your cube is done, you can process it to surface any errors, and after that, you can deploy it to an Analysis Services instance of SQL Server.  You actually have to use a deployment wizard buried in your Start menu folder structure to do that; there’s no way that I can find to deploy to a remote instance from the application (it will deploy locally).</p>

<p>I’ve included my multidimensional project in the download below.</p>

<p><strong>Tabular</strong></p>

<p>In response to the ever-decreasing cost of storage and memory, and to the somewhat steep learning curve of the multidimensional model, the tabular model was born.  It is simpler to learn and runs on the same code base as the Microsoft Excel PowerPivot feature.  If you know one, you know the other.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/tabular.png" /></p>

<p>Instead of cubes and dimensions, we now have something that looks a lot like Excel.  As with multidimensional, we still have to create hierarchies, and as an added wrinkle, we have to create a concatenated column to join DimSQLServer to FactIdera, as the tabular model does not support joins on more than one key column.</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/tabularhierarchy.png" /></p>
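
<p>The concatenated key can be a simple calculated column added to both tables; the column names below are assumptions based on the model described earlier:</p>

<pre><code>-- In DimSQLServer: combine the two key columns into one
ServerDatabaseKey = DimSQLServer[SQLServerID] &amp; "-" &amp; DimSQLServer[DatabaseID]

-- The equivalent column in FactIdera, so the tables can be related on it
ServerDatabaseKey = FactIdera[SQLServerID] &amp; "-" &amp; FactIdera[DatabaseID]
</code></pre>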

<p>Other than those two things, it’s pretty straightforward.  Deployment is actually specified in the beginning when you create your project.  You can connect to a tabular instance or you can continue as an isolated project.</p>

<p>The biggest gotcha with tabular is that it’s an in-memory application (as in, your data is processed and presented in-memory).  You can spool to disk, but performance will degrade.  If you have a model with terabytes of data, tabular may not work for you.</p>

<p>I’ve included my tabular model project in the download below.</p>

<p><strong>PowerBI Reports</strong></p>

<p>The final part of this project is the fun part.  PowerBI is great to use, and since you’ve spent all this time creating a dimensional model, your reward is that the data will respond quickly to your queries.<br>
There are many resources out there for learning PowerBI, but the best suggestion I have is to play and work with it yourself.  Eventually you’ll get the hang of it.</p>

<p>Connecting to a data source is easy.  We can choose either our multidimensional model or our tabular model.  It doesn’t matter in our case, but I chose the tabular model.</p>

<p>After successfully connecting, you’ll see your measures and dimensions on the right side, and you can start dragging visualizations as you please.  With some practice, you’ll be able to create some pretty cool-looking reports (I’ve hidden the slicers on the right showing server and database names).</p>

<p><img alt="Silvrback blog image " src="https://silvrback.s3.amazonaws.com/uploads/284df9a8-28b7-48f8-ae86-012c7430ea34/powerbi.png" /></p>

<p>Notice in the bottom right corner that we are connected live to the data.  That means each time we click a slicer option or click on a report (to drill through), the data updates in real time.  PowerBI is based on the technologies behind PowerPivot, PowerView and PowerBI in Excel, and uses the DAX query language to reach out to the tabular model.</p>

<p>After having worked with it for a little while now, the best advice I can give you regarding PowerBI is to take advantage of your filters.  You have report-, page- and visual-level filters, and they take effect in that order.  If you try to work with all your data at once, you’ll run into issues, as the data won’t fit on one screen.</p>

<p>That’s it for this project.  Please click <a href="https://github.com/raymondgallagher/Business-Intelligence-With-Idera">here</a> for the projects and code.  I’ve also included a brief <a href="https://onedrive.live.com/?authkey=%21AAPwE%2DQQIahFVYo&id=F44DAEF88DAE3258%21109&cid=F44DAEF88DAE3258">presentation</a> as well.</p>
]]></content:encoded>
      </item>
  </channel>
</rss>