University BC

University is becoming increasingly irrelevant for the IT industry.

It’s three years of full-time study, yet in a month a talented 12-year-old can write an app that makes him a millionaire. Course content always lags behind – not for lack of pushing by academics and industry, but because the bureaucracy of the system drags. With content such as teamtreehouse.com on the up, there is potential for a real upset in the IT education market. And without any entrepreneurship support, there is no excitement and no potential to build something meaningful from nothing. Increasingly, universities will be perceived as the old way – by new students, and by the industry looking to hire.

I would like to see cadetships for IT, with online content and part-time attendance at training institutions for professional development and assessment. Although even assessment is questionable: students are not given access to the internet during assessments, which does not reflect any true-to-life scenario. I value a portfolio of code over grades.

I seek individuals who have experience in Single Page Applications (SPA), Knockout.js, JavaScript, jQuery, Entity Framework, C#, 2D drawing, graphic art, and SQL (window functions). Others are looking for Ruby on Rails developers. All of my recent graduates have very limited exposure to any of these.

I could be wrong, but if I am right, institutions that ignore such facts are only going to find out the hard way. I know the IT industry has been reaching out to universities to help keep them relevant; it’s time for universities to reach back out to the industry, and to relinquish some of their control for the benefit of both students and the industry.

IL to SQL

There are cases where more complex processing is needed, calling for either application-side processing or custom SQL commands for better performance. For example, splitting one column of comma-delimited data into 3 other columns:

public virtual void DoEntityXSplit()
{
  var needSplitting = db.EntityXs.Where(x => x.Splitted1 == null);
  foreach (var item in needSplitting)
  {
    string[] split = item.DelimitedField.Split(',');
    item.Splitted1 = split[0];
    item.Splitted2 = split[1];
    item.Splitted3 = split[2];
    item.Save();    // either save each item as you go...
  }
  db.SaveChanges(); // ...or commit the whole batch at once
}

When you run DoEntityXSplit, the unoptimised code may run. However, automatic optimisation is possible, derived from the IL (Intermediate Language – i.e. .NET bytecode) body of the method, when:
i) the ORM (Object Relational Mapper – e.g. NHibernate / Entity Framework) supports some sort of “ILtoSQL” compilation at all; and
ii) the function doesn’t contain any unsupported patterns or references.
When both hold, the raw SQL may be run instead. This could even include dynamically creating a stored procedure for faster operation still.

public override void DoEntityXSplit()
{
  //This is pseudo SQL code
  db.RunQuery("
    declare cursor @NeedSplitting as (
      select ID, DelimitedField
      from EntityXs
      where Splitted1 is null
    );

    open @NeedSplitting;
    fetch next from @NeedSplitting into @ID, @DelimitedField
    while (@StillmoreRecords)
    begin
      @Splitted1 = fn_CSharpSplit(@DelimitedField, ',', 0)
      @Splitted2 = fn_CSharpSplit(@DelimitedField, ',', 1)
      @Splitted3 = fn_CSharpSplit(@DelimitedField, ',', 2)

      update EntityXs
      set Splitted1 = @Splitted1,
      Splitted2 = @Splitted2,
      Splitted3 = @Splitted3
      where ID = @ID

      fetch next from @NeedSplitting into @ID, @DelimitedField
    end
  ");
}

Of course, this could also be compiled to:

public override void DoEntityXSplit()
{
  //This is pseudo SQL code
  db.RunQuery("
    update EntityXs
    set Splitted1 = fn_CSharpSplit(DelimitedField, ',', 0),
        Splitted2 = fn_CSharpSplit(DelimitedField, ',', 1),
        Splitted3 = fn_CSharpSplit(DelimitedField, ',', 2)
    where Splitted1 is null
  ");
}

but I wouldn’t expect that from version 1… or would I?

Regardless, one should treat IL as source code for a compiler which has optimisations for T-SQL output. The ORM mappings would need to be read to resolve IL properties/fields to SQL fields. It may sound crazy, but it’s definitely achievable, and this project looks like a perfect fit for such a feat.
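
As a rough sketch of the starting point (the names here are illustrative only, not from any real ILtoSQL library): reflection already hands you the raw IL bytes of a method, which is exactly the input such a compiler would walk.

using System;
using System.Reflection;

class ILDumpDemo
{
    // Stand-in for the real ORM-backed class holding DoEntityXSplit.
    public virtual void DoEntityXSplit() { }

    static void Main()
    {
        MethodInfo method = typeof(ILDumpDemo).GetMethod("DoEntityXSplit");
        byte[] il = method.GetMethodBody().GetILAsByteArray();
        // An ILtoSQL compiler would decode these opcodes (e.g. via a reader
        // such as Mono.Cecil) and resolve property/field accesses to SQL
        // columns using the ORM's mapping metadata.
        Console.WriteLine("{0} bytes of IL to translate", il.Length);
    }
}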

Where will BLToolkit be in 10 years? I believe ILtoSQL should be a big part of that future picture.

If I get time I’m keen to have a go; it should be built as a standalone DLL which any ORM can leverage. Who knows, maybe EF will pick this up?

Windowing functions – Who Corrupted SQL?

I hate writing sub-queries, but I seem to hate windowing functions even more! Take the following:

select
PR.ProfileName,
(select max(Created) from Photos P where P.ProfileID = PR.ID) as LastPhotoDate
from Profiles PR

In this example, I want to list all profile names, and also include a statistic: the date of the most recently uploaded photo. It’s quite easy to read, if a little bloated, but compared to the windowing-function version it is slower. Let’s have a look at the more performant alternative:

select distinct
PR.ProfileName,
max(Created) OVER (PARTITION BY PR.ID) as LastPhotoDate
from Profiles PR
join Photos P
on P.ProfileID = PR.ID

That’s actually quite clear (if you are used to windowing functions) and performs better. But it’s still not ideal: coders now need to learn about OVER and PARTITION just to do something seemingly trivial. SQL has let us down. It looks like someone who builds RDBMSs told the SQL committee to add windowing functions to the standard. It’s not user friendly at all – computers are supposed to do the hard work for us!

It should look like this:

select
PR.ProfileName,
max(Created)
from Photos P
join Profiles PR
on PR.ID = P.ProfileID
group by PR.ID --or group by PR.ProfileName

I don’t see any reason why an RDBMS cannot make this work. I know that if a person gave me this instruction and I had a database, I would have no trouble. Of course, if different partitioning is required within the query, then there is the option for windowing functions, but for the stock standard challenges, keep the SQL simple!

Now what happens when you get a more difficult situation? What if you want to return the most recently uploaded photo (or at least the ID of the photo)?

--Get each profile's most recent photo
select
PR.ProfileName,
P.PhotoFileName,
P.PhotoBlob
from Photos P
join Profiles PR
on PR.ID = P.ProfileID
join (
select ProfileID, max(Created) as Created
from Photos
group by ProfileID
) X
on X.ProfileID = P.ProfileID
and X.Created = P.Created

It works, but it’s awkward and has potential for performance problems. From my limited experience with windowing functions and a short search on the web, I couldn’t readily find a clean windowing function solution.
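
For what it’s worth, the closest pattern I can piece together uses ROW_NUMBER(), but it still wraps everything in a derived table – hardly the simplicity I’m asking for:

select ProfileName, PhotoFileName, PhotoBlob
from (
  select
  PR.ProfileName,
  P.PhotoFileName,
  P.PhotoBlob,
  row_number() over (partition by P.ProfileID order by P.Created desc) as RN
  from Photos P
  join Profiles PR
  on PR.ID = P.ProfileID
) X
where RN = 1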

Why can’t the SQL standards group innovate? Something like this:

select
ProfileName,
P.PhotoFileName,
P.PhotoBlob
from Photos P
join Profiles PR
on PR.ID = P.ProfileID
group by ProfileID
being max(Created) --reads: being a record which has the maximum created field value

And leave it to the RDBMS to decide how to make it work? In procedural coding over a set, while searching for a maximum value you can also keep hold of the entity which has that maximum. There’s no reason this can’t work.
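
To illustrate in C# (a minimal sketch; Photo here is just a stand-in record type):

using System;
using System.Collections.Generic;

class Photo { public int ID; public DateTime Created; }

static class NewestPhotoDemo
{
    // One pass keeps both the maximum Created value and the record holding it.
    static Photo NewestPhoto(IEnumerable<Photo> photos)
    {
        Photo newest = null;
        foreach (var p in photos)
            if (newest == null || p.Created > newest.Created)
                newest = p; // remember the whole record, not just the maximum
        return newest;
    }
}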

It seems the limitation is the SQL standardisation body. I guess someone could always implement a work-around: a plugin for open-source SQL query tools, plus open-source functions to convert SQL+ [with such abilities as introduced above] into plain SQL.

(By the way, I have by no means completely thought out all of the above, but I hope it conveys the spirit of my frustrations and of a possible solution – I hope some RDBMS experts can comment on this dilemma.)

SQL-like like in C#

Sometimes you need to build dynamic LINQ queries, and that’s when the Dynamic Query Library (download) comes in handy. With this library you can build a where clause using BOTH SQL and C# syntax. Except for one annoying problem: like isn’t supported.

When using pure LINQ to build a static query, you can use SqlMethods.Like. But you will find that this only works when querying a SQL dataset. It doesn’t work for local collections – there’s no C# implementation.
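
For example (a sketch; whether it works depends entirely on what the queryable is backed by):

using System.Data.Linq.SqlClient;
using System.Linq;

static class LikeDemo
{
    // When 'names' is a LINQ to SQL query this translates to a T-SQL LIKE.
    // Against an in-memory collection it throws NotSupportedException at
    // runtime, because SqlMethods.Like has no local implementation.
    public static IQueryable<string> Johns(IQueryable<string> names)
    {
        return names.Where(n => SqlMethods.Like(n, "Jo%n"));
    }
}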

My Solution

So I mocked up a quick and dirty like method which only supported a single % wildcard, no escape characters and no _ placeholder. It did the job, but with so many people asking for a solution which mimics like, I thought I’d make a complete one myself and publish it, Public Domain style.

It features:

  • Wildcard (%) is fully supported
  • Placeholder (_) is fully supported
  • Escape characters (!) are fully supported
  • Replaceable tokens – you can change the wildcard (%), placeholder (_) and escape (!) tokens when you call the function
  • Unit tested
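
The download contains the real implementation; to give a feel for it, here is a minimal regex-based sketch of an EvaluateIsLike with the signature the parser changes below expect (the published version differs in detail):

using System.Text;
using System.Text.RegularExpressions;

public static class SQLMethods
{
    public static bool EvaluateIsLike(string input, string pattern)
    {
        return EvaluateIsLike(input, pattern, '%', '_', '!');
    }

    // Translate the like pattern into an anchored regular expression,
    // honouring the wildcard, placeholder and escape tokens.
    public static bool EvaluateIsLike(string input, string pattern,
        char wildcard, char placeholder, char escape)
    {
        var regex = new StringBuilder("^");
        bool escaped = false;
        foreach (char c in pattern)
        {
            if (escaped) { regex.Append(Regex.Escape(c.ToString())); escaped = false; }
            else if (c == escape) escaped = true;           // next char is literal
            else if (c == wildcard) regex.Append(".*");     // % matches any run
            else if (c == placeholder) regex.Append(".");   // _ matches one char
            else regex.Append(Regex.Escape(c.ToString()));
        }
        regex.Append("$");
        return Regex.IsMatch(input ?? "",
            regex.ToString(), RegexOptions.IgnoreCase | RegexOptions.Singleline);
    }
}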

Downloads:

Adding like support to the Dynamic Query Library – Dynamic.cs

I also modified the Dynamic Query Library to support like statements, leveraging the new function. Here are the steps required to add support yourself:

1. Add the Like value into the ExpressionParser.TokenId enum

            DoubleBar,
            Like   // new token for the like operator
        }

2. Add the token.id == TokenId.Like clause as shown below into ExpressionParser.ParseComparison()

        Expression ParseComparison() {
            Expression left = ParseAdditive();
            while (token.id == TokenId.Equal || token.id == TokenId.DoubleEqual ||
                token.id == TokenId.ExclamationEqual || token.id == TokenId.LessGreater ||
                token.id == TokenId.GreaterThan || token.id == TokenId.GreaterThanEqual ||
                token.id == TokenId.LessThan || token.id == TokenId.LessThanEqual ||
                token.id == TokenId.Like) {

3. Add the TokenId.Like case as shown below into the switch found at the bottom of the ExpressionParser.ParseComparison() function

                    case TokenId.LessThanEqual:
                        left = GenerateLessThanEqual(left, right);
                        break;
                    case TokenId.Like:
                        left = GenerateLike(left, right);
                        break;
                }

4. Add the following inside the ExpressionParser class (the SQLMethods class needs to be accessible – reference its library or copy in its source, and add a using directive for the appropriate namespace)

        Expression GenerateLike(Expression left, Expression right)
        {
            if (left.Type != typeof(string))
                throw new Exception("Only strings supported by like operand");

            return IsLike(left, right);
        }

        static MethodInfo IsLikeMethodInfo = null;
        static Expression IsLike(Expression left, Expression right)
        {
            if (IsLikeMethodInfo == null)
                IsLikeMethodInfo = typeof(SQLMethods).GetMethod("EvaluateIsLike", new Type[] { typeof(string), typeof(string) });
            return Expression.Call(IsLikeMethodInfo, left, right);
        }

5. Change the start of the default switch case in ExpressionParser.NextToken() according to the code shown below

                default:
                    if (Char.IsLetter(ch) || ch == '@' || ch == '_') {
                        do {
                            NextChar();
                        } while (Char.IsLetterOrDigit(ch) || ch == '_');

                        string checktext = text.Substring(tokenPos, textPos - tokenPos).ToLower();
                        if (checktext == "like")
                            t = TokenId.Like;
                        else
                            t = TokenId.Identifier;
                        break;
                    }
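
With those five changes in place, like parses like any other comparison operator. For example (db.Profiles being a hypothetical queryable; string literals use double quotes in the dynamic expression language, and % follows the SQLMethods wildcard token):

var results = db.Profiles.Where("ProfileName like \"Jo%n\"").ToList();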

Example

I use this in my own business system, but I preprocess the like rule, as I have quite a few “AI” rules for bank transaction matching. (You can also use like statements directly.)

There are many ways to cache; here is how I cache a predicate, looping over the set of AIRules in my DB:

RuleCache[i].PreProcessedPredicate = DynamicQueryable.PreProcessPredicate<vwBankTransaction>(RuleCache[i].Filter); //Change the text-based predicate into a LambdaExpression

And then here is how I use it, looping over the array of cached rules:

bool MatchesRule = DynamicQueryable.Where(x.AsQueryable(), RuleCache[i].PreProcessedPredicate).Any(); //Run the rule

Where `x` is a generic list (not a DB query) containing the one record I am checking. (Yes, it would be possible to loop over a larger set [of bank transactions], but I haven’t got around to that performance improvement in my system – I haven’t noticed any performance issues, so it’s not broken.)
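
For reference, PreProcessPredicate isn’t part of the stock library – mine is essentially a thin wrapper over the parser, something along these lines (a sketch; a matching Where overload then accepts the pre-parsed lambda):

// Added inside the DynamicQueryable class in Dynamic.cs:
public static Expression<Func<T, bool>> PreProcessPredicate<T>(string predicate)
{
    // Parse the text predicate once into a strongly-typed lambda,
    // so each rule is parsed a single time and then reused.
    return (Expression<Func<T, bool>>)DynamicExpression.ParseLambda(
        typeof(T), typeof(bool), predicate);
}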
