Database Publishing
Digest
Database publishing allows publishable Home-area assets (Pages, Files, and Folders) to be published to an external MySQL database. This gives developers of third-party applications a way to access the content in Cascade in a structured, tabular format. Database publishing requires a Database Transport and a Destination that uses the Transport. Content published to this Destination will end up as records in the remote database. MySQL 5+ is the only supported external database.
Concept
Requirements for Database Publishing
- Database publishing works only with MySQL 5+.
- The default character set on the external database and all its tables and text columns must be set to utf8, and the collation must be utf8_unicode_ci.
Types of Content being Published
- Files - Published files include the name, location, path, and metadata of the associated asset in the CMS; the published version, however, does not include the byte content of the file.
- Folders - Published folders include the name, location, path, and metadata of the associated asset in the CMS; they serve as containers for collections of other files, pages, and folders.
- Pages - Published pages include the name, location, path, and metadata of the associated asset in the CMS; also includes the rendered content of the DEFAULT region (i.e. page specific content with any default region transformations applied). Works with both Structured Data and WYSIWYG Pages.
External Database
Database publishing creates records in an external database that containing five tables: file, folder, page, metadata, and metadata_custom. The file, folder and page tables contain records that are mapped from the corresponding assets managed in Cascade. The metadata and custom_metadata tables contain the wired and dynamic metadata for the page, file and folder assets represented in the file, folder, and page tables. A few notes on some of the above fields:
- account_id - should always be 1
- site_id - is an arbitrary value that reflects a particular site to which the record belongs, and is set on the transport
- folder_id - corresponds to the cms_id of a folder, and not its id field in the remote database
The default database schema for database publishing can be downloaded here.
Site IDs
As mentioned above, a site id is an arbitrary number set on a Database Transport to reflect the 'site' to which a record belongs. The site id should be unique for each of the multiple Database Transports using the same external MySQL database to effectively distinguish the content.
Technical
Creating a Database Transport
To create a new Database Transport:
- Go to Administration > Transports
- Click New Transport on the left.
- On the Transport Type selection screen choose Database and click Submit.
- Complete the following fields in the Generalpane:
- System Name- The name of the Transport in Cascade.
- Parent Container- The Transport Container in which the Transport will reside.
- Site Id- An identifier that is attached to each item published using this Transport that is used to differentiate between publishes from different Database Transports.
- Server Name- The hostname of the server to which Cascade will connect.
- Server Port- The port over which Cascade will communicate with MySQL.
- Database Name - The name of the database into which data will be inserted.
- Username- The username used when authenticating with MySQL.
- Password- The password to be used in combination with Usernameto authenticate with MySQL.
- Confirm Password- Ensures that the password entered is correct by requiring that it be typed again.
- Submit.
-
Troubleshooting Database Transports
If Cascade is unable to publish to an external database, make sure these guidelines are being followed:
- The server name and port are correct (usually 3306 for MySQL)
- A firewall is not blocking access on the above port
- The host that is trying to connect has granted access to the database transport user
- The bind-address parameter in the MySQL configuration is not set to localhost or the loopback address
Useful information in solving these problems can be garnered from the output of a transport or destination test.
Other Notes
Until now, publishing has consisted of moving files to a local or remote file system. Database publishing is more along the lines of synchronizing some portion of the asset hierarchy within Cascade to representative records in an external database. When using database publishing, keep in mind a few of these important items :- Records inserted during a database publish do not represent the assets in the state in which they would have been published to a file system by simply publishing them. This is particularly true if options are set on relevant targets and destinations that relate to publish-time directory and link manipulation (i.e. remove base folder path, include target path, destination path, etc.).
- Links in page content are not rewritten during database publishing, and should reflect the in-CMS path of the entities the records represent.
- After particular sequences of publishes, deletes, and unpublishes (or lack thereof), the state of the external database should be considered.
For example, a folder containing two pages:
- Folder
- Page 1
- Page 2
The folder, its pages, and their associated metadata would be written to the external database if a user were to publish this folder to a destination backed by a database transport. Page 1 is then deleted and the folder is republished. If the external database was examined, it would be discovered that Page 1 still has records therein. In database publishing as in normal publishing, it is assumed that users will indicate exactly which assets they want unpublished from their destination.
If the same asset is published to destinations using different database transports, this may result in several files, pages, and folders in the external database with the same cms_id. It is important to know what criteria to include in SQL queries to ensure that the correct records are being operated on. The cms_id, the account_id, and the site_id of the assets being operated on should match those on which the user intends to operate. The folder_id of records in the remote database corresponds to the cms_id of its parent folder, and not to the folder_id.
If a user was dealing with a site with id=5 and an account with id=1, and wanted to get the parent folder of a file named ‘koko.png’ in the external database, the file record would appear in the external database like so:
|
id |
account_id |
site_id |
cms_id |
folder_id |
metadata_id |
name |
path |
|
9 |
1 |
5 |
92436e0a7f00010100b2eca959237ccd |
92436dc67f00010100b2eca953a7298f |
10 |
koko.png |
site5/images/koko.pn |
The desired folder record would appear like so:
|
id |
account_id |
site_id |
cms_id |
folder_id |
metadata_id |
name |
path |
|
3 |
1 |
5 |
92434dc67f00010100b2eca953a7298f |
9a436dc67f00010100b2eca234526666 |
10 |
images |
site5/image |
In order to select this record, the correct query would be:
SELECT * FROM folder WHERE cms_id='92436dc67f00010100b2eca953a7298f' AND site_id=5 AND account_id=1;
It is possible that there is another record corresponding to the images folder in the external database with the same cms_id, but with a different account_id, site_id, or both. If the expressions restricting the site_id and account_id were omitted from the query above, multiple folders would be returned in the result.

