Multitenancy Webinar FAQ

This is an FAQ for the Multitenancy Webinar.

Q: What's the total number of tables and indexes in your multitenant schema?

A: A few hundred.


Q: If you were to design this all over again -- knowing what you learned since the start -- would you use a non-relational datastore for some parts?

A: We do use file based storage for some objects like attachments.


Q: Is Apex run using the JRE?

A: No. A custom virtual machine executes all Apex code, which allows us to manage limits and record locking for optimized performance.


Q: Is your full text search engine home-grown, Oracle's, or from another source?

A: We use Apache Lucene.


Q: Is there ever a need for service outages?

A: A few times a year we take planned downtime to perform service upgrades.


Q: Does the platform support parametric search?

A: Yes. We provide multiple filter criteria with our full text search.


Q: Do you use MySQL for anything?

A: No

Q: How do you build your objects' GUIDs (how long are those?) and how do you avoid clashes?

A: They are 15-character IDs. We have a table that maintains the "next id" available, and we lock that row when allocating new IDs.
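
The allocation scheme described above can be sketched roughly as follows. This is a minimal illustration, not the platform's implementation: the table name id_allocator, the base-62 encoding, and the zero padding are all assumptions, and SQLite's BEGIN IMMEDIATE write lock stands in for locking the single counter row in Oracle.

```python
import sqlite3

# Assumed alphabet; the FAQ only says the ids are 15 characters long.
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def encode_id(n, width=15):
    """Encode a non-negative counter as a fixed-width base-62 string."""
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits)).rjust(width, "0")


def allocate_ids(conn, count=1):
    """Reserve `count` consecutive ids while holding the write lock."""
    conn.execute("BEGIN IMMEDIATE")  # stand-in for locking the "next id" row
    try:
        (start,) = conn.execute("SELECT next_id FROM id_allocator").fetchone()
        conn.execute("UPDATE id_allocator SET next_id = ?", (start + count,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return [encode_id(start + i) for i in range(count)]


# isolation_level=None puts the connection in autocommit mode so the
# explicit BEGIN/COMMIT statements above control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE id_allocator (next_id INTEGER NOT NULL)")
conn.execute("INSERT INTO id_allocator VALUES (1)")

first_batch = allocate_ids(conn, 3)
second_batch = allocate_ids(conn, 2)
```

Because the counter is read and advanced under one lock, two callers can never be handed overlapping ranges, which is how clashes are avoided in this sketch.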


Q: How 'special' is the Salesforce CRM application on your platform? i.e. do the CRM programmers have tools available to them which are not currently available to us?

A: The CRM programmers are allowed to program in the base platform (written in Java); however, we are trying to convert *everything* to Apex and Visualforce.


Q: In the table that contains values, is it limited to 500, or was that just an example? What is the max number of values for a row?

A: 500 is the limit, though some entities (e.g. activities) have a smaller number of fields.


Q: Can a user have permissions to access more than a single OrgID?

A: No


Q: Is there a single db for all your customers? If not, how many firms' data is stored in a single db/hardware stack?

A: We have approx 10 dbs for all customers and each db stores tens of thousands of customers


Q: Do you store the lucene indexes in the database?

A: no, on a file system


Q: What application and web server do you use?

A: Resin


Q: Do you allow Federated SSO? (SAML)

A: Yes! 1.1 and 2.0


Q: How many machines go in a pod? Do the machines each have a particular job in its tier, or do machines have functionality across tiers (e.g. a machine has app and data and messaging all in one)?

A: approx 50 machines, each dedicated to a broad set of functionality (app tier, db tier, file serving)


Q: Could you talk about how Org wide record sharing is maintained from the data perspective.

A: We maintain sharing tables that explicitly grant access from a record to a user or group.
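
As a rough illustration of what such a sharing table might look like, here is a small sketch. The table and column names (record_share, group_member, grantee_id) are hypothetical, and SQLite is used only so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_member (org_id TEXT, group_id TEXT, user_id TEXT);
CREATE TABLE record_share (org_id TEXT, record_id TEXT, grantee_id TEXT);
INSERT INTO group_member VALUES ('org1', 'sales', 'alice');
INSERT INTO record_share VALUES ('org1', 'rec1', 'sales');  -- grant to a group
INSERT INTO record_share VALUES ('org1', 'rec2', 'bob');    -- grant to a user
""")


def can_access(conn, org_id, user_id, record_id):
    """True if a share row grants the record to the user or to one of the user's groups."""
    row = conn.execute(
        """
        SELECT 1 FROM record_share s
        WHERE s.org_id = ? AND s.record_id = ?
          AND (s.grantee_id = ?
               OR s.grantee_id IN (SELECT g.group_id FROM group_member g
                                   WHERE g.org_id = ? AND g.user_id = ?))
        LIMIT 1
        """,
        (org_id, record_id, user_id, org_id, user_id),
    ).fetchone()
    return row is not None
```

Here alice can see rec1 only through her membership in the sales group, while bob has a direct grant on rec2. Every lookup is also scoped by org_id.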


Q: When will you support more than one level of master-detail, i.e. parent, child, grandchild?

A: Yes. This is on our short term roadmap


Q: Do you leverage Oracle's Virtual Private Database feature? If not, why?

A: Virtual Private Databases would require a different Oracle user per tenant, which isn't scalable (you can't reuse connections, etc)


Q: do you use bitmap indexes

A: no


Q: How do you manage your meta-metadata? Is it just the db schema of the metadata tables? Can different tenants have different metamodels? How does the app discover the metamodel?

A: we maintain our own data tables that describe the metadata and every tenant can have a different metadata model.
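
To make the idea concrete, here is a minimal sketch of per-tenant metadata mapping logical fields onto generic value columns. The FIELD_METADATA structure, the object and field names, and the type converters are hypothetical; only the value0-style column naming and the text storage come from the discussion in this FAQ.

```python
# Hypothetical per-tenant metadata:
# (org_id, object_name) -> {field_name: (value slot, converter from stored text)}
FIELD_METADATA = {
    ("org1", "Invoice"): {"Amount": (0, float), "Status": (1, str)},
    ("org2", "Invoice"): {"DueDate": (0, str), "Paid": (1, lambda v: v == "true")},
}


def read_record(org_id, object_name, data_row):
    """Turn a generic data row (value0, value1, ...) into a typed record
    using the metadata that belongs to this tenant."""
    fields = FIELD_METADATA[(org_id, object_name)]
    record = {}
    for name, (slot, convert) in fields.items():
        raw = data_row[f"value{slot}"]
        record[name] = None if raw is None else convert(raw)
    return record


org1_invoice = read_record("org1", "Invoice", {"value0": "19.99", "value1": "Open"})
org2_invoice = read_record("org2", "Invoice", {"value0": "2009-01-31", "value1": "true"})
```

Both tenants use the same physical columns, but value0 means Amount for one and DueDate for the other. That is the sense in which each tenant can have a different metadata model in this sketch.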


Q: What do you use for monitoring?

A: Lots of technologies. Nagios, Splunk, Homegrown


Q: According to your doc, Apex code when triggered as web service does not obey sharing rules - is this correct? If so what is the recommended way to write Apex that can be triggered by Ajax that obeys sharing?

A: Use "with sharing" on the class, i.e. public with sharing class Foo


Q: Is it possible to write generic code in Apex Code? For instance is it possible to pass as a parameter to a function a void pointer or a parameter which is of an object type that is a base class to all other object types?

A: Yes, look at the SObject class

Q: On our custom objects can we specify indexes on single as well as multi fields?

A: Only on single fields. No composite indexes.


Q: If we'd like to replicate our database locally, should we define all our local database tables as text fields just like they are in your database?

A: no need, that would reduce performance in your local db, you know the field types that you need, and you query them by type to get a local copy


Q: If we can't create multi-field indexes how can performance improvement on queries involving multiple field criteria be achieved?

A: you can't and it's limiting right now.


Q: Silly question, but how is it that SFDC has maximized Oracle so well, and Oracle has done a miserable job of developing a competitive CRM to SFDC?

A: on premise is hard and no longer appealing to customers


Q: With tables partitioned by OrgId, aren't some partitions much larger than others (different customer data sizes), causing imbalance and, in turn, a performance impact?

A: Partition size may vary. But the performance trick is that we maintain our own db statistics per OrgId that allow us to drive queries more efficiently. That will be addressed later in the presentation.
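
One way to picture how per-OrgId statistics could drive a query is the sketch below. The statistics layout, the thresholds, and the plan names are assumptions for illustration only; the FAQ does not describe the actual optimizer logic.

```python
def choose_plan(org_stats, org_id, field, max_fraction=0.1, small_table=1000):
    """Pick an access path for an equality filter on `field` using
    statistics kept per tenant rather than per physical table."""
    stats = org_stats.get(org_id)
    if stats is None or stats["row_count"] <= small_table:
        return "partition_scan"  # tiny or unknown tenant: just scan its rows
    distinct = stats["distinct_values"].get(field, 0)
    if distinct == 0:
        return "partition_scan"  # no statistics for this field
    # Expected fraction of the tenant's rows matched by one value.
    selectivity = 1.0 / distinct
    return "index_lookup" if selectivity <= max_fraction else "partition_scan"


# Hypothetical statistics for one large and one small tenant.
ORG_STATS = {
    "big_org": {"row_count": 5_000_000,
                "distinct_values": {"status": 4, "email": 1_000_000}},
    "small_org": {"row_count": 200,
                  "distinct_values": {"status": 4, "email": 180}},
}
```

The same filter on email leads to an index lookup for big_org but a plain scan for small_org, because the decision depends on that tenant's own data volume and distribution.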


Q: How do you ensure applications do not hog resources by doing extensive computations. Do you evict applications that take a long time to respond?

A: We have governors in apex, rate limiting on our app servers, and metadata limits


Q: is there encryption available for security sensitive fields, like passwords?

A: Yes. We take proper precautions with passwords, and we offer encrypted custom fields to our customers


Q: is the data replicated across all data centers? just in case of a natural disaster

A: yes, this is done


Q: what's your data backup strategy?

A: http://www.salesforce.com/saas/questions-about-saas/


Q: if I build a custom app on force.com...do my customers need to purchase salesforce.com licenses in order to purchase licenses for my app?

A: yes, we sell you an oem license called platform


Q: Is Salesforce's Sharing Model integrated with this pre-query optimizer?

A: Yes it is. The # of records that the user can view influences the query plans that are generated.


Q: Is there a hard limit to the number of columns in the data table? your example showed 500 (501) values. that corresponds to the max number of custom fields on an object today. does this mean there is a hard maximum on the number of custom fields a salesforce object can ever have?

A: Currently, force.com has a hard limit of 500 fields. However, it is on the roadmap to support unlimited custom fields by chaining those tables together.


Q: How do you back your report queries? Do you use a BI platform? Are you replicating/ETLing to a different DB for report queries?

A: we run reports on the main database


Q: Are updates to the pivot tables performed in the same transaction as the updates to the data tables?

A: Yes.
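
A small sketch of what "same transaction" means in practice follows. The table names data and field_index are hypothetical stand-ins for the data and pivot tables, and SQLite is used only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (
    org_id TEXT, record_id TEXT, value0 TEXT,
    PRIMARY KEY (org_id, record_id)
);
CREATE TABLE field_index (
    org_id TEXT, record_id TEXT, field_slot INTEGER,
    string_value TEXT NOT NULL
);
""")


def save_record(conn, org_id, record_id, value0):
    """Write the data row and its index (pivot) row atomically."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT OR REPLACE INTO data VALUES (?, ?, ?)",
            (org_id, record_id, value0),
        )
        conn.execute(
            "DELETE FROM field_index "
            "WHERE org_id = ? AND record_id = ? AND field_slot = 0",
            (org_id, record_id),
        )
        conn.execute(
            "INSERT INTO field_index VALUES (?, ?, 0, ?)",
            (org_id, record_id, value0),
        )


def count(conn, table):
    """Row count helper; `table` is a fixed name from this sketch, not user input."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

If the index write fails, the data write is rolled back as well, so the two tables cannot drift apart.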

Q: Why not the decision to have separate tables for each Object?

A: We cannot perform DDL for customer operations and maintain multitenancy efficiently.


Q: You discussed value0 in the data table, what are the other 499 value fields used for?

A: The other 499 fields are associated with the table


Q: Besides real-time replication, do you do off-site backups or archives?

A: yes, we do


Q: What's the difference between the NA and EMEA instances of salesforce.com? Can a developer on Force.com request his/her application to run on a specific instance?

A: The difference is only in when we perform system maintenance. They use the same code, hardware, etc.


Q: Why is there downtime during maintenance and upgrades?

A: For Oracle schema modifications and upgrade scripts to transform to new schema


Q: Will there ever be support for composite indexes or external ids?

A: We provide support for external IDs, but not composite indexes


Q: How do you ensure security with shared tables?

A: Every query to the database *must* include the OrgId, or else it is an error.

Q: If I develop apps on force.com. What is my maintenance burden during platform upgrade?

A: Pretty much nothing. We maintain complete compatibility for every upgrade. We will never break your code.


Q: Have you had any trouble with convincing organizations of the effective security provided by your multitenancy model on virtual databases? Your approach may not stand up to some of our client's security requirements (which are based on more traditional models, required physical separation of data in a multi-tenant environment)?

A: more and more organizations are understanding our model, and understand the total security provided by this model


Q: Do all the values in the data table have entries in the index table? If not, do we control that at all?

A: Not every value is indexed, only those that would benefit from indexing. There is a cost to maintaining those indexes, and you do have control over it (turn the fields into External IDs)


Q: How do you manage Oracle statistics?

A: we have a performance team that analyzes stats changes and manually applies them


Q: In what language is SF's kernel written?

A: Java & PL/SQL


Q: considering there is a hard limit of 500 fields/columns (Val0...Val500), how do you track deleted fields? Do you reuse these Val# columns after a field has been deleted? If so, at what point do you delete the content of these fields?

A: The data for deleted fields is maintained in the val* columns. It's wiped clean when either (1) the customer hard deletes the field in the UI, or (2) 30 days have passed.
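
The two wipe conditions can be expressed as a small check. This is only a sketch: the function and parameter names are hypothetical, and it assumes the 30 days are counted from the moment the field was deleted.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # retention period stated in the answer above


def should_wipe(deleted_at, hard_deleted, now):
    """True once the val* column for a deleted field may be wiped."""
    if hard_deleted:
        return True  # the customer hard deleted the field in the UI
    return now - deleted_at >= RETENTION  # retention period has elapsed


deleted_at = datetime(2009, 1, 1)
```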


Q: Are the lucene indexes clustered? they must be really big! how do you deal with that?

A: we have one or more indexes per customer


Q: Do you replicate tenant data? What is the replication arch? Do you replicate across data centers?

A: Yes, across data centers and within the datacenter. We use Oracle DataGuard and Hitachi TrueCopy


Q: Do you lose customers by not having data on separate databases, and if they really insist, what do you do ?

A: On occasion yes, but the benefits of maintaining the multitenant architecture outweigh the costs


Q: Do you use direct SQL from the application point or use any ORMs ?

A: We have a home-grown ORM for simple transactions and direct SQL and PL/SQL for complicated operations (e.g., forecast calculation)


Q: Do you use something like Hadoop to accomplish distributed search across your Lucene indexes, or something custom built?

A: custom built


Q: How high of a priority is it for you to integrate a true BPM engine, how far out do you see this happening?

A: I can't give a firm answer. But we are looking for broad workflow enhancements in the next few releases.


Q: How do you handle tenant sandboxing?

A: I'm not exactly sure what you mean by this, but every database interaction *must* include the OrgId


Q: How can I get SAS70 cert for your data center?

A: contact your account executive


Q: Can a Customer request a restore of their ORG's data?

A: yes, contact support


Q: Do you support Unicode at the field level?

A: Yes. Our data is encoded (and maintained) in UTF-8 and we support multiple encodings for import and export. We also support localized entry and various functions (like Upper/Lower)


Q: How do you prevent SQL Injection and Cross-Site Scripting?

A: we use prepared statements exclusively for sql, and we have implemented various anti-xss measures in our UI framework
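
For readers unfamiliar with the technique, here is a generic sketch of both ideas using Python's standard library. It is not the platform's code: the account table is hypothetical, and html.escape is just one common anti-XSS measure chosen for illustration.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (org_id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("org1", "Acme"), ("org1", "Globex"), ("org2", "Initech")],
)


def find_accounts(conn, org_id, name):
    """Look up accounts with bound parameters, so user input is always
    treated as data and never parsed as SQL."""
    rows = conn.execute(
        "SELECT name FROM account WHERE org_id = ? AND name = ?",
        (org_id, name),
    ).fetchall()
    return [r[0] for r in rows]


def render_name(name):
    """Escape user-supplied text before placing it in an HTML page."""
    return html.escape(name)


# A classic injection attempt is matched literally and finds nothing.
malicious = "Acme' OR '1'='1"
injection_result = find_accounts(conn, "org1", malicious)
```

With string concatenation the malicious value would have rewritten the WHERE clause. With bound parameters it is just an account name that does not exist.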


Q: My assumption is that all AppExchange data/configurations fold into these core tables, correct?

A: yes, all as meta data

Q: Does the platform allow RESTful interactions?

A: Yes, this is enabled by Apex callouts


Q: A couple of years ago Salesforce went through uptime issues (especially at release time) and re-architected the app. This was around the time trust.salesforce.com came out. Which pieces of what is being discussed were implemented in that re-architecture? In other words, which of these are the big "lessons learned" more recently?

A: By and large the architecture discussed in this talk has been around since the beginning; the rearchitecture was more around the Oracle hardware layer and higher (middle tier and web tier) layers


Q: What's the total number of tables and indexes in your multitenant schema?

A: approx 700 tables and 2300 indexes


Q: What steps do you take to ensure that data from one org cannot be accessed by another org?

A: Every query to the database *must* include the OrgId, or else it is an error.


Q: What steps do you take to ensure that data from one org cannot be accessed by another org?

A: Every table has an OrgId column. At compile time, we enforce constraints that every query must have a filter on that column. Simplistic solution, but it's effective.
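
The answer above describes a compile-time check. As a simplified runtime analogue, the sketch below shows a query helper that cannot produce SQL without an OrgId filter. The helper name, the org_id column name, and the error type are all hypothetical.

```python
class MissingOrgIdError(ValueError):
    """Raised when a query is requested without a tenant id."""


def tenant_select(table, columns, org_id, extra_where="", params=()):
    """Build a SELECT that is always filtered by org_id.

    `table`, `columns` and `extra_where` are assumed to come from trusted
    application code; only values travel as bound parameters.
    """
    if not org_id:
        raise MissingOrgIdError("every query must include the OrgId")
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE org_id = ?"
    if extra_where:
        sql += f" AND ({extra_where})"
    return sql, (org_id, *params)


sql, bound = tenant_select("account", ["name"], "org1", "name = ?", ("Acme",))
```

Since callers never write the WHERE clause's tenant filter themselves, forgetting it is an error rather than a silent data leak.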


Q: Theoretically, could you have a customer who creates so many data records that it starts affecting performance? If so, what would that number be?

A: We have appropriate governors in place to prevent this. Rate limiting across all of the app servers for a particular user, metadata limits, apex governor limits, etc.
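
As an illustration of the rate-limiting part, here is a minimal sliding-window limiter keyed by user. The class, its parameters, and the idea of passing the clock in explicitly are assumptions for the sketch. A real deployment spanning many app servers would need shared state, which is not shown here.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most `max_requests` per user within `window_seconds`."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id, now):
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = RateLimiter(max_requests=2, window_seconds=10.0)
decisions = [
    limiter.allow("u1", 0.0),
    limiter.allow("u1", 1.0),
    limiter.allow("u1", 2.0),   # third request inside the window
    limiter.allow("u2", 2.0),   # a different user has a separate budget
    limiter.allow("u1", 10.0),  # the request at t=0 has expired by now
]
```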


Q: What programming languages are used for the middle and higher level tiers?

A: Java


Q: Can we increase web service response size from 100 K to 1MB?

A: Yes, this is planned.


Q: Yes, History tracking fields.

A: Yes, there will be a performance hit, but it should be minor. There's more processing taking place at runtime and larger data volumes in the history table for your organization.

Q: Is the query optimizer written in ProC?

A: Java


Q: What do you use for caching?

A: We have our own caching solution for metadata, but do use memcached for some things


Q: Do you use or have you considered Oracle's XML query/XPATH capabilities for denormalizing data in XML documents and running queries against them ?

A: We don't


Q: Do you partition data using Oracle partitioning by OrgId

A: yes


Q: Do you use Oracle row-level security, aka VPD?

A: No, we have our own implementations of similar functionality


See the Multitenancy Webinar page to view the webinar again.