Database Publishing
Digest
Database publishing allows content (pages, files, folders, etc.) to be published to an external database for the purpose of performing live reads from a web application or any other type of application. This is made possible by setting up a database transport and a destination that uses it, and makes it possible to publish to the external database by way of that destination – the same technique used in methods of publishing such as FTP or SFTP.
Concept
Requirements for Database Publishing
Database publishing supports only MySQL 5.0+ for external database servers. The default character set on the external database and all its tables and columns must be set to utf8, and the collation must be utf8_unicode_ci.
Types of Content being Published
- Files - Published files include the name, location, path, and metadata of the associated asset in the CMS; the published version, however, does not include the byte content of the CMS.
- Folders - Published folders include the name, location, path, and metadata of the associated asset in the CMS; they serve as containers for collections of other files, pages, and folders.
- Pages - Published pages include the name, location, path, and metadata of the associated asset in the CMS; also includes the rendered content of the default region (page-specific content with any default region transformations applied). Rendered content is published for both structured data and WYSIWYG pages.
External Database
Database publishing directs records to an external database that utilizes a set-up containing five tables: file, folder, page, metadata, and metadata_custom. The file, folder, and page tables contain records that reflect the state of Cascade-managed assets. Records that reflect Cascade Dublin and custom metadata values for those file, folder and page tables are contained in metadata and custom_metadata tables. A few notes on some of the above fields:
- account_id should always be 1
- site_id is an arbitrary value that reflects a particular site to which the record belongs, and is set on the transport
- folder_id corresponds to the cms_id of a folder, and not its id
The default database schema for database publishing can be found here:
http://www.hannonhill.com/downloads/cascade/db-publish-schema.zip
Site IDs
As mentioned above, a site id is an arbitrary number set on a transport to reflect the ‘site’ to which a record belongs. As applications for multiple sites may operate on a single external database, there may be multiple ‘sites’ in that database. The site id should be set accordingly for the various transports. When assigning them to destinations, be sure that items being published are making it to the remote database with the correct site id.
Technical
Troubleshooting Database Transports
If Cascade is unable to publish to an external database, make sure these guidelines are being followed:
- The server name and port are correct (usually 3306 for MySQL)
- A firewall is not blocking access on the above port
- The host that is trying to connect has granted access to the database transport user
- The bind-address parameter in the MySQL configuration is not set to localhost or the loopback address
Useful information in solving these problems can be garnered from the output of a transport or destination test.
Other Notes
Until now, publishing has consisted of moving files to a local or remote file system. Database publishing is more along the lines of synchronizing some portion of the asset hierarchy within Cascade to representative records in an external database. When using database publishing, keep in mind a few of these important items :
- Records inserted during a database publish do not represent the assets in the state in which they would have been published to a file system by simply publishing them. This is particularly true if options are set on relevant targets and destinations that relate to publish-time directory and link manipulation (i.e. remove base folder path, include target path, destination path, etc.).
- Links in page content are not rewritten during database publishing, and should reflect the in-CMS path of the entities the records represent.
- After particular sequences of publishes, deletes, and unpublishes (or lack thereof), the state of the external database should be considered.
For example, a folder containing two pages:
- Folder
- Page 1
- Page 2
The folder, its pages, and their associated metadata would be written to the external database if a user were to publish this folder to a destination backed by a database transport. Page 1 is then deleted and the folder is republished. If the external database was examined, it would be discovered that Page 1 still has records therein. In database publishing as in normal publishing, it is assumed that users will indicate exactly which assets they want unpublished from their destination.
If the same asset is published to destinations using different database transports, this may result in several files, pages, and folders in the external database with the same cms_id. It is important to know what criteria to include in SQL queries to ensure that the correct records are being operated on. The cms_id, the account_id, and the site_id of the assets being operated on should match those on which the user intends to operate. The folder_id of records in the remote database corresponds to the cms_id of its parent folder, and not to the folder_id.
If a user was dealing with a site with id=5 and an account with id=1, and wanted to get the parent folder of a file named ‘koko.png’ in the external database, the file record would appear in the external database like so:
|
id
|
account_id
|
site_id
|
cms_id
|
folder_id
|
metadata_id
|
name
|
path
|
|
9
|
1
|
5
|
92436e0a7f00010100b2eca959237ccd
|
92436dc67f00010100b2eca953a7298f
|
10
|
koko.png
|
site5/images/koko.pn
|
The desired folder record would appear like so:
|
id
|
account_id
|
site_id
|
cms_id
|
folder_id
|
metadata_id
|
name
|
path
|
|
3
|
1
|
5
|
92434dc67f00010100b2eca953a7298f
|
9a436dc67f00010100b2eca234526666
|
10
|
images
|
site5/image
|
In order to select this record, the correct query would be:
SELECT * FROM folder WHERE cms_id='92436dc67f00010100b2eca953a7298f' AND site_id=5 AND account_id=1;
It is possible that there is another record corresponding to the images folder in the external database with the same cms_id, but with a different account_id, site_id, or both. If the expressions restricting the site_id and account_id were omitted from the query above, multiple folders would be returned in the result.

