From SQL to SOQL
by Dan Appleman
As a .NET developer looking at Force.com, there are many differences to consider, as Force.com represents a radically different paradigm for software development. There is the development experience, which relies on a high level of abstraction on the underlying platform rather than Visual Studio design surfaces and code generation. There’s Apex, a language that superficially looks like C#, but requires a completely different set of design patterns. And then there’s the database.
For .NET developers, “database” almost always means SQL Server or other disconnected database that uses a variant of SQL. For Force.com developers, “database” almost always refers to the integrated Force.com database that uses a query language called SOQL and a search language called SOSL.
When comparing databases there is a strong tendency to compare database features or query language capabilities. If you are a database expert, you are welcome to do so. But most .NET developers are not database “experts” – we are database users. We write solutions that are tasked with performing database operations, and our goal is usually to accomplish this as effectively and efficiently as possible, while hopefully not making terrible mistakes that would impact scalability or security. In this article, I am not going to discuss database features, so much as developer process. In other words – what are the differences from a software developer’s perspective of .NET/SQL as compared to Force.com/SOQL? What do you need to know to make that transition?
The Joys of Integration
The .NET framework is designed to be database agnostic – to work with any database, though the vast majority of .NET developers use one of the variants of SQL server. While you can use integration techniques to store data externally from Force.com, the vast majority of Force.com applications make use of the integrated database. The fact that the database is integrated has a profound impact on software development.
In .NET, database access involves a variety of classes and components such as the data set, data adapter, command, connection and binding classes. These classes, or the related LINQ to SQL classes, provide the mechanisms for developers to access the database. They have to be flexible to handle a variety of database providers and query languages. The integrated nature of the Force.com database eliminates the need for all of these components and classes – because you only need native connectivity to one database, the programming model can be designed to access the database directly.
One of the challenges in .NET is storing connection strings securely. It is common practice to store connection strings in the web.config file. And while you can use encryption to secure connection strings, many developers don’t bother. In either case, anyone who has access to the server can retrieve the connection strings and access the database. In Force.com, there are no connection strings – database access security is inherent in access to the platform.
Securing a database only starts with securing the connection string. With .NET, databases may or may not have their security integrated with Windows. Even if they do, their security configuration is distinct from that of Windows. You have to concern yourself with security on many levels – the ability to configure the database server, to create or modify databases, to create and modify database tables, and to read or write data within those tables. You have to determine, manage and sometimes implement your authentication scheme. Database security on Force.com is also complex, but in different ways. Authentication is easy, especially if you rely entirely on built-in authentication. The ability to configure the database and modify tables is bound to the same user identities used to access the Force.com platform. Data access is generally more sophisticated, with the ability to secure individual fields based on user profiles, and to control access to individual records in the database based on user, role, or configurable data sharing rules.
In .NET, you also have to be concerned with database deployment. If you manage your own database, it has to be on a server somewhere. That server has to be maintained, secured and periodically upgraded. If you use a hosted service or cloud service, you have to manage the service account and implement and configure the connection to the database. In Force.com, there are no deployment issues – security, access, upgrading and data maintenance and integrity issues are all handled by Salesforce. With the exception of security – which requires learning a very different database security model than the one you use in .NET, the integrated nature of the Force.com database dramatically simplifies database connectivity, access and maintenance tasks. It also impacts the way you program against the database.
Entities, Without the Framework
One of the fundamental challenges faced by any developer using a database on any language is mapping programming concepts to database concepts. As developers, we are accustomed to using object oriented programming – where data is abstracted as objects that can have fields and references to other objects. But databases use tables with data types that may not map cleanly to the language you are using, and while tables have lookup relationships; their underlying mechanism is different from those used in software. In .NET, you can map database data to objects manually, or use tooling and frameworks such as strongly typed datasets or the Entity Framework.
You might think of Force.com as the Entity Framework, without the framework. In Force.com, every database table corresponds to an object that inherits from the SObject object type. Database fields automatically map to their corresponding language types. There are no tools, configuration or XML files required to create the object-relational mapping – it just happens. If you add a field to a database object, that field can be immediately referenced from workflows and code. What’s more, once you have referenced a database field, Force.com prevents you from deleting that field or changing it to an incompatible data type.
The relationship between objects and tables is so tight that in Force.com we don’t refer to tables at all – we only refer to them as objects (or SObjects when discussing code).
The native language of the Force.com database is SOQL (Salesforce Object Query Language). A distinct language called SOSL (Salesforce Object Search Language) is used for searching – providing the ability to search for text values across multiple fields and object types in a single operation.
Unlike SQL, SOQL is used exclusively for querying data. Insert, update and delete operations are performed using a distinct set of methods called DML (Data Manipulation Language) methods. To insert an object, you create an instance of the object using the Apex new statement, then use the insert method to insert it. For example: here is how you can insert a single contact object:
// DML operation to insert a single contact Contact c = new Contact(testfield__c = 'initialvalue', FirstName = 'Joe', LastName = 'Jones'); insert c;
To update an object, you can use a SOQL query statement to load the current value of the object, modify any fields you wish to new values, then use the update method to push the updates to the database. These DML operations can, of course, operate on arrays of objects, much as a dataset update can modify multiple rows in a single update operation in .NET.
There are two types of SOQL statements – static SOQL and dynamic SOQL. A static SOQL statement is a bit like LINQ, in sprit if not in syntax. Here is an example that queries the “Contact” table – which I will refer to henceforth as the contact object.
String searchfor = 'Jones'; Contact contacts = [Select testfield__c, FirstName, LastName from Contact where LastName= :searchfor];
The first line defines a string variable “searchfor” that will contain the last name of the person we are looking for. The second line defines an array called “contacts” that is loaded with all of the Contact objects in the database where the last name is “Jones”. The square brackets separate the SOQL code from the Apex language code. The “:” symbol escapes the SOQL statement to incorporate the value of a variable into the query.
The SOQL statement is compiled. That means that a significant amount of compile time checking takes place. The compiler will verify that all of the requested fields exist in the database. It will also prevent anyone from deleting any of those fields from the database until they have first been removed from the code.
Dynamic SOQL is used when you need to define the query at runtime. The dynamic equivalent of this statement looks like this:
Contact contacts = (Contact)Database.query('Select testfield__c, FirstName, LastName from Contact where LastName= :searchfor');
Static SOQL is almost always preferred over Dynamic SOQL, as Dynamic SOQL lacks the compile time field checking and the protection from field deletion – the platform can’t anticipate which fields you might include in a dynamic string. There are, however, scenarios where dynamic SOQL is preferred, in particular when creating software packages intended for sale to multiple organizations.
Joins Made Simple
Have you ever been confused between left inner joins, right outer joins, and all of the other varieties of SQL join syntax? I’ll be honest, it was never second nature to me. On Force.com object relationships happen… well, almost magically. For example, the standard Contact object can be related to an Account object – it’s a one-to-many relationship where each contact contains a lookup to one account, and each Account can be referenced by multiple contacts.
Let’s say you want to query the name of the account (if any) in the previous example. In SQL, you’d have to specify the lookup relationship using a Join statement. In SOQL, you can just reference the field on the account using “Account.Name” as follows:
Contact contacts = [Select ID, Account.Name, Name, Email from Contact where LastName = :searchfor];
If the contact doesn’t belong to an account, the account name will be null. And it’s trivial to reference the account name of the first retrieved contact in code as shown here:
String theaccountname = contacts.Account.Name;
How about the other way – what if you wanted to obtain the names of every contact where the account name is “XYZ Inc.”
Account accounts = [Select ID, Name, (Select ID, LastName from Contacts) from Account where Name = 'XYZ Inc.'];
The subquery builds the object relationship for you, where each account references and array of associated contacts. You could then retrieve the last name of the second contact on the first account as follows:
String thefirstcontact = accounts.contacts.LastName;
Of course, in a real program you would first check to make sure that an object was found by testing the length of both the accounts array and the contacts array for that account.
As you can see, the object relational mapping capabilities of Force.com go beyond simple object retrieval to include object relationships as well.
In SQL you might make use of stored procedures and indexes to improve performance. Force.com makes use of these concepts internally, but they are largely hidden from you. You create the queries you wish and rely in the platform to optimize your request to the underlying database.
With no stored procedures, DML operations moved into the language, and database management built into the platform, SOQL is simpler than SQL in that it is used only for queries – the equivalent of the SQL SELECT statement. As such, you’ll find using it similar to SQL, with some interesting twists.
- You’ve already seen that there is no need for JOIN statements – object relationships are implicit.
- SOQL supports polymorphic relations. Force.com sometimes allows an object field to reference more than one type of object. Polymorphic relations allows you to specify different fields to query depending on the type of object referenced.
- Aggregate functions. SOQL supports aggregate functions such as SUM() and Count() in much the same way as SQL. It also supports sorting and grouping.
- Date range queries. SOQL has built-in support for use of fiscal quarters and fiscal years. It can do this because it is aware of the fiscal year settings of the organization running the current application.
- Currency fields. Both SQL and SOQL include “money” types. SOQL, however, is aware of both the reference currency of an organization and that of the current user. This makes it possible for SOQL to perform currency conversions for organizations that use multiple currencies.
- SOQL supports both optimistic and explicit record locking.
Generally speaking, you’ll find SOQL very similar to SQL, just easier to use.
The Force.com database supports both automatic and explicit transactions.
The automatic transactions deserve special note. Let’s say you insert an object and have implemented code in the insert trigger for that object type that modifies other records. On SQL, you would have to make explicit use of transaction boundaries to make sure that data integrity was maintained if an exception occurs. On Force.com, the entire insert operation is by default wrapped in a transaction – if an error occurs in any code executed during that insertion, all of the database operations during the insertion attempt will revert.
But the automatic transactioning that is perhaps most useful is the one that takes place during unit tests. Your unit tests can create data and modify data in any way they wish – all changes are reverted at the end of the test. In fact, by default your unit tests see a largely empty database, making it simple to create unit tests that are not impacted by the current state of the database. This even makes is possible to safely perform testing and diagnostics on production systems – your unit tests can’t impact production data in any way.
There are some who look at Force.com as a programmable database rather than a general purpose software development platform. This is a mistake. First, because Force.com is itself a database user – somewhere deep under the hood they are using other database technologies. SOQL is not actually a native database language – the SOQL your write ultimately gets compiled down into native database queries. So arguing that Force.com is a programmable database is technically questionable.
But more important, looking at Force.com as a programmable database is far too limiting – the platform can be used to create applications where the database is incidental, or not used at all. Then again, the seamless integrated nature of the database makes it so easy to use that you might find yourself incorporating it into designs in ways you had not considered before.
About the Author
Dan Appleman is a well-known author, software developer and speaker. He is currently the CTO of Full Circle CRM, A Salesforce ISV partner that helps Salesforce users manage sales and marketing campaigns across leads and contacts and accurately measure ROI. He is also the author of the book “Advanced Apex Programming for Saleforce.com and Force.com”, and several Pluralsight courses including Force.com for .NET Developers and Force.com and Apex Fundamentals for Developers.