This month’s T-SQL Tuesday is hosted by Erin Stellato, asking participants to talk about “A Day in the Life”.
I had been ignoring this event all this while because I had other things on my plate, but this time I decided to get over it. What nudged me was the last event, “T-SQL Tuesday #31”, where I saw so many tweets and several posts mentioning T-SQL Tuesday showing up on my blog roll. My curiosity kicked in, and from then on I dug around to find out who started this, what happens here, how many I have missed so far, and a lot more. Alright, I had better stop my usual blah, blah and get to the topic.
“A Day in the Life” brings multiple thoughts to mind. They span both personal and professional life, and I shall keep my day’s sharing to professional life. I chose one of the busiest days of the week, 7/12, Thursday.
- Was I literally doing this, on this day?
Well, at least I didn’t have multiple monitors to watch, but I did have multiple tasks waiting for me. Some of these were known and some came from unknown corners. Either way, I had to close some of them and keep working on others because I knew they would take more time. Alright, here are a few of those I had to attend to during the day.
LINQ to HPC solution
This was one of the interesting and challenging parts of the day, and one I had planned well. You know anyway that whatever you plan never happens; however, I managed to reach a conclusion on what I had to do.
I had been trying to host one of the LINQ to HPC sample solutions on a Windows Azure HPC cluster and see if I could invoke the Job Scheduler through the Job Manager on the Head Node. Voilà, I could configure the Head Node, the Compute Nodes and the Front End Node. But I couldn’t get as far as invoking the job because of firewall issues between the systems. That part was half done with no clue about where it was failing, so I had to sit with a teammate to figure out whether he had some other ideas. Sometimes my ideas die as duds, so I had a fallback option. Lucky me, he was around, and since he had done this earlier I was able to offload that part of the problem to him. Great! Smart delegation of work, simply because I am not too familiar with such an implementation and didn’t want to spend time hunting for fixes when I had a solution sitting nearby.
Wait, my so-called smartness was held up by an urgent need that I never thought would come by and sit there staring at me. Before I switch to the next task I had to be on, let me share the links on the LINQ to HPC solution on a Windows Azure HPC cluster. Here are a few for your reading, and from them you could probably guess what I was doing.
I was trying to evaluate how parallel computing of different data sets happens on the HPC cluster. In the end I had to understand the nuances (limitations, workarounds, constraints, risks, fallback) of implementing such a solution.
I know people familiar with LINQ to HPC would ask me, “Why on earth would you want to try a framework that has no more releases?” Well, it’s for a specific requirement and we had to evaluate this.
Now a context switch!
Data Trends in the market using Microsoft SQL Server
I had been quite involved in evaluating the LINQ to HPC solution, and then came a surprise task. It isn’t that I am not familiar with the subject, but it required a bit of thinking. The assignment was to share the latest data trends in the market for SQL Server. Nice topic, and then came a flood of thoughts. This, that, there, what, where, when, and then I felt that the thoughts had started vanishing. In the end I managed to bring up a list of topics that are most happening with SQL Server, and there is a very bright roadmap for yet-to-come features.
Here is the list of topics that I felt are making a great show and will make a difference in the future.
1. Data Quality & Master Data Management
2. Semantic Data Model
3. In-Memory Processing [Speed-of-thought analysis]
1. Self-Service BI
2. Mobile BI
3. Cloud BI [Data Mining and Reporting on cloud]
4. Pervasive BI
Big Data with High Performance Analysis
1. Hadoop and MapReduce on Windows Azure
I am not going to detail in this blog post why these are the data trends. I managed to gather some good references for further reading on Mobile BI, Cloud BI and Big Data with High Performance Analysis.
This done, I had to switch over again to another pending task that came by. In this case I didn’t have the option of pushing it out further, because there were folks waiting right behind my chair and looking over my shoulder to get it done.
Evaluating a data model for a Social Media app
This was a little more challenging than the earlier tasks because it had to do with a Social Media app. I couldn’t approach modeling this database along the lines of an OLTP or Decision Support System. I had to make sure that the data model can scale and perform per the non-functional requirements.
So here are a few design principles I picked up after reading a few articles, blog posts and e-books, understanding the behavior of users of social networking apps, drawing on a bit of my own experience, and so on.
At a level of Logical Data Model:
1. Keep the entities as de-normalized as is practical; I would think of 2NF/3NF.
2. Avoid lookup or reference entities unless the lookup values number more than 10.
3. Keep the entities that are read from separate from those that are written to, to the best possible extent.
At a level of Physical Data Model:
1. Keep the size of the row as low as possible.
2. Avoid data types in the column definitions that inflate the size of the row. I typically prefer all-numeric data types for apps that have high-concurrency requirements.
3. Identify the partition key and partition the table as much as possible. Stick to the partitioning rules: don’t over-partition a table, and don’t apply a partitioning strategy to a table that doesn’t require one.
4. Index the partition key (goes without saying), and have appropriate covering indexes on the columns that are most sought by the queries or that are referenced quite often.
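To make the covering index point a little more concrete, here is a minimal sketch. It uses Python’s built-in sqlite3 module purely as a runnable stand-in, not SQL Server, and the `posts` table, its columns and the index name are all hypothetical. SQLite has no table partitioning, so only the indexing half of the advice is shown. On SQL Server the same idea is usually expressed with an index that has INCLUDE columns.

```python
import sqlite3


def plan_for(conn, sql, params=()):
    """Return the 'detail' strings from EXPLAIN QUERY PLAN for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")

# Hypothetical narrow table: numeric columns only, plus one wide text column.
conn.execute("""
    CREATE TABLE posts (
        post_id    INTEGER PRIMARY KEY,
        author_id  INTEGER NOT NULL,
        created_at INTEGER NOT NULL,  -- Unix epoch seconds
        like_count INTEGER NOT NULL DEFAULT 0,
        body       TEXT
    )
""")

# The index carries every column the timeline query touches.
conn.execute("""
    CREATE INDEX ix_posts_author_created
    ON posts (author_id, created_at, like_count)
""")

# Answered from the index alone: no lookup into the base table.
timeline_plan = plan_for(
    conn,
    "SELECT created_at, like_count FROM posts "
    "WHERE author_id = ? ORDER BY created_at DESC",
    (42,),
)

# Needs 'body', which is not in the index, so the index is not covering here.
body_plan = plan_for(
    conn,
    "SELECT body FROM posts WHERE author_id = ?",
    (42,),
)

print(timeline_plan)
print(body_plan)
```

The first plan should mention a covering index, while the second should use the same index only to find rows and then go back to the table for `body`. Which columns deserve a place in the index depends on the queries your app actually runs.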
At a level of code:
1. Use appropriate predicates in the DML statements, including SELECT. This helps the query engine choose the most optimized path to fetch data for referencing or updating.
2. As far as possible, don’t set explicit transaction isolation levels, since we don’t have any business-critical data in this social networking app, except for the features that are commonly known.
All of these sound very generic and well known; however, there is always a chance of losing sight of their importance while creating tables or writing code. At the end of this short write-up I realized that most of the rules I have listed apply equally well to OLTP systems, but there are a few that are specific to a social application.
The intent of re-emphasizing simple, known principles was to bring in concurrency along with a high-performing and scalable design for a multi-tenant environment. By reintroducing these rules I could think of avoiding excessive blocking of processes and achieving high concurrency.
Well, this is all too theoretical, and sometime I shall blog on some of these rules with proofs of concept.
At the end of this I thought I should share some nice links which I discovered and which made me write on these.
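One way to read the first rule about predicates is to write them so that an index can be used to seek instead of scan. The sketch below is only an illustration under that assumption. It again uses Python’s sqlite3 module as a stand-in for SQL Server, and the `comments` table, its columns and the index name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with an index on the timestamp column.
conn.execute("""
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        post_id    INTEGER NOT NULL,
        created_at TEXT NOT NULL,  -- ISO-8601 text, e.g. '2012-07-12 09:00:00'
        body       TEXT
    )
""")
conn.execute("CREATE INDEX ix_comments_created ON comments (created_at)")
conn.executemany(
    "INSERT INTO comments (post_id, created_at, body) VALUES (?, ?, ?)",
    [
        (1, "2012-07-12 09:00:00", "first"),
        (1, "2012-07-12 18:30:00", "second"),
        (2, "2012-07-13 00:00:00", "third"),
    ],
)


def plan_for(sql, params):
    """Return the 'detail' strings from EXPLAIN QUERY PLAN for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


# Predicate wraps the column in a function: the engine must look at every row.
wrapped_sql = "SELECT comment_id FROM comments WHERE date(created_at) = ?"
wrapped_args = ("2012-07-12",)

# Same intent as a plain range on the column: the index can be searched.
ranged_sql = (
    "SELECT comment_id FROM comments "
    "WHERE created_at >= ? AND created_at < ?"
)
ranged_args = ("2012-07-12", "2012-07-13")

wrapped_plan = plan_for(wrapped_sql, wrapped_args)
ranged_plan = plan_for(ranged_sql, ranged_args)

wrapped_ids = sorted(r[0] for r in conn.execute(wrapped_sql, wrapped_args))
ranged_ids = sorted(r[0] for r in conn.execute(ranged_sql, ranged_args))

print(wrapped_plan)
print(ranged_plan)
```

Both queries return the same rows, but only the range form lets the engine search the index. The same habit applies to the WHERE clauses of UPDATE and DELETE statements.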
1. http://mikesdataviews.blogspot.com.au/ - A blog that I personally find great for thoughts on data modeling. Mike started this blog recently and has already written a bunch of nice posts, and it is one of the blogs I have marked under my watch category.
2. http://www.sei.cmu.edu/library/assets/defect-prediction-techniques.pdf - A white paper on how to empirically evaluate data model quality around a few important parameters.
3. http://sqlserverpedia.com/wiki/Understanding_the_StackOverflow_Database_Schema - A short, nice article by Kevin Kline giving insight into how the Stack Overflow database schema has been designed to handle high concurrency and provide performance.
[Schedulers switched off]
The day ended with this, and it was pretty busy as expected. Nevertheless, I had the satisfaction that I could learn something new and share a bunch of my learnings with like-minded people in the community through T-SQL Tuesday. I think I played the role of “Database Architect”, which is my official title in my organization.
Please feel free to comment and share your thoughts and experiences.
PS: My blog posts are my own thoughts and don’t reflect any views of my company.
Republished from SQLServer-QA.net.