Pentaho Business Intelligence Suite 3.7
A guide to getting started with MySQL 5.x and Windows
Table of Contents
Configuring Hibernate and Quartz
Configuring Apache-Tomcat Server
Configuring SMTP (mail server)
Configuring the Administration Console
This work is licensed under a Creative Commons Attribution 3.0 Australia License.
This tutorial is accessed by thousands on a monthly basis, from most of the feedback many found it extremely helpful! But contrary to belief I do not work for Pentaho and all of this work is voluntary, so even $1 can help me with producing bigger and better tutorials!
You can donate to my PayPal account by clicking here.
To use this guide It is assumed that readers have intermediate to advanced knowledge in their setup of choice and basic knowledge of Pentaho (although it is not needed). The following operating systems and databases are supported:
Windows *
MySQL 5.x *
PostgreSQL 8.x.x
Oracle 10g & 11g
Microsoft SQL Server 2005+
Linux
MySQL 5.x
PostgreSQL 8.x.x
Oracle 10g & 11g
* This tutorial is for Linux and MySQL 5.x setup.
Don't
forget about the other hardworking projects which are part of the
Pentaho community and also
deserve a donation:
PAT
(Pentaho Analysis Tool)
An
alternative to Pentaho's current OLAP analyser tool, JPivot.
CDF
(Community Dashboard Framework)
A
framework for building dashboards within Pentaho's Business
Intelligence Server User Console.
CBF
(Community Build Framework)
Is
an ant build.xml file script and alternate way to setup and deploy
Pentaho based applications
CDA
(Community Data Access)
A
data access layer for CDF
(Community Dashboard Framework).
Thanks
to the following blogs, individuals, companies and
groups:
##pentaho
& ##pentaho.pat
IRC
channels found on Freenode
(Pentaho and Pentaho PAT).
Pentaho
Wiki
& Pentaho
Forums
The
first place any new user to Pentaho should look.
Open
Source Business Intelligence
Provided
a working copy of the sample database for MySQL.
Bizcubed
Provided
a working copy of the sample database for PostgreSQL - they are also
Australian!
Pentaho
Solutions: Business Intelligence and Data Warehousing with Pentaho
and MySQL
A
book by Roland and Jos van Dongen.
Installing and Configuring Java
The Pentaho BI Platform requires a JVM (Java Virtual Machine) to be installed on your PC or server. To check if Java is already installed issue the following command (seen in bold) at the command prompt:
C:\>java
-version
java
version "1.6.0_13"
Java(TM) SE Runtime Environment
(build 1.6.0_13-b03)
Java HotSpot(TM) Client VM (build 11.3-b02,
mixed mode, sharing)
If
a similar output (seen above) is displayed Java is already
installed. If not, to install Java on Windows you will need to
download the Java installation file from the Sun Developer Network
downloads
page.
The
next step is to check if the JAVA_HOME
environment variable is setup correctly, issue the following command
(seen in bold) at the command prompt:
C:\>echo
%JAVA_HOME%
C:\Program
Files\Java\jdk1.6.0_13
If a similar output (seen above) is displayed the JAVA_HOME environment variable is already setup. To setup the JAVA_HOME environment variable right click on My Computer and click the Properties option then the Advanced tab and click the Environment Variables button.
Depending
on your setup (User
variables
or System
variables)click
on the New
button to create a new Environment
Variable
(in this guide I will be adding them for the user). For the variable
name
enter JAVA_HOME
and for the variable
value find
the location of your Java installation in this example it is
c:\Program
Files\Java\jdk1.6.0_13:
The CATALINA_OPTS environment variable should also be set to tell the Apache-Tomcat server to use more than the default memory, to do this follow the same steps from above but this time make sure you set the variable name to CATALINA_OPTS and the variable value to -Xms256m -Xmx768m -XX:MaxPermSize=256m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000:
From now on every time the PC or server is started/restarted the JAVA_HOME and CATALINA_OPTS environment variables will be set automatically.
Installing and Configuring MySQL 5.x Server
Click
here for a Windows
guide.
You are able to deploy the platform in many different ways but in this guide I will explain how to deploy it with the packaged Apache-Tomcat server (comes with the Pentaho BI Server installation file) or with an existing Apache-Tomcat server.
You
will need to first download the biserver-ce-3.7.x.stable.zip
file from the Pentaho
Sourceforge projects page
- this file contains all the files/packages needed for setting up our
platform. After downloading extract its contents into a folder you
would like to store the Pentaho BI Server - in this example I have
chosen c:\pentaho\.
Use 7-Zip
to extract the file contents to C:\pentaho\
folder.
The following folders should be visible after you
have extracted the ZIP
file:
C:\
|--
pentaho
|
|-- adminstration-console
|
|-- biserver-ce
If
you would like to deploy the Pentaho BI Platform on an existing
Apache-Tomcat server first extract the contents of the
biserver-ce-3.7.x.stable.zip
file found on the Pentaho
Sourceforge projects page.
After downloading extract its contents into a folder you would like
to store the Pentaho BI Server - in this example I have chosen
c:\pentaho\
Use
7-Zip
to extract the file contents to C:\pentaho\
folder.
The following folders should be visible after you
have extracted the ZIP
file:
C:\
|--
pentaho
|
|-- adminstration-console
|
|-- biserver-ce
| |--
pentaho-solutions
| |--
tomcat
| |--
common
|
`-- lib
|
`--
mysql-connector-java-5.0.7.jar
(optional)
|
|-- webapps
|
`--
pentaho
|
`--
pentaho-styles
|
`--
sw-styles
The files and folders in bold (seen above) will need to be moved to your existing Apache-Tomcat installation.
If you already have Apache-Tomcat working with MySQL then you won't need to copy the mysql-connector-java-5.0.7.jar file to your tomcat's \common\lib folder. If not to copy the mysql-connector-java-5.0.7.jar to your tomcat's \common\lib folder.
You
will need to copy all the folders under the
C:\pentaho\biserver-ce\tomcat\webapps\
folder to the webapps\
folder under your existing Apache-Tomcat installation (the sw-styles
webapp is optional).
The last step is to move the
pentaho-solutions folder into the C:\pentaho\
folder or any other location which you would like to store all your
pentaho solutions and configuration files.
One
more step will need to be done to make sure Pentaho knows the new
location of the pentaho-solutions
folder which will be covered in the "Configuring Apache-Tomcat"
section.
The new structure of the pentaho-solutions
and existing Apache-Tomcat folder looks like this:
C:\
|-- pentaho
| `-- pentaho-solutions
|-- tomcat
| -- webapps
You
can now safely remove any other files that came with the original
Pentaho BI Platform (only under the biserver-ce/
folder).
A SQL Script Pack is a set off SQL scripts which will configure all the necessary databases. To download the SQL Script Pack for MySQL 5.x click here.
Extract the MySQL 5.x SQL Script Pack
After downloading the SQL Script Pack for MySQL 5.x you will need to extract the files into a temporary location. These are the five SQL scripts which should be visible after the pack has been extracted:
1_create_repository_mysql.sql
Creates the Hibernate database
2_create_quartz_mysql.sql
Creates the Quartz database
3_create_sample_datasource_mysql.sql
Loads the sample data data source into the Hibernate
database
4_load_sample_users_mysql.sql
Creates all the sample users and roles into the Hibernate
database
5_sample_data_mysql.sql
Creates the sample data database
You
must load the above scripts in the order they are listed.
Load these SQL scripts using your favourite MySQL front end tool i.e.
MySQL console, Navicat Lite, phpMyAdmin etc (in this section I will
show you how to load the SQL scripts using the MySQL console).
Before
you start make sure that you place all your SQL scripts in the folder
which you will be logging into the MySQL console, in this example
that is C:\pentaho\tmp.
Launch
a terminal instance and login into MySQL:
C:\pentaho\tmp> mysql -u [your_username] -p[your_password]
Issue the following commands found in bold one after the other:
mysql>
source
1_create_repository_mysql.sql;
...output
mysql>
source
2_create_quartz_mysql.sql;
...output
mysql>
source
3_create_sample_datasource_mysql.sql;
...output
mysql>
source
4_load_sample_users_mysql.sql;
...output
mysql>
source
5_sample_data_mysql.sql;
...output
Now
run the following command (in bold) to see if you have successfully
created the hibernate,
quartz
and
sampledata
databases:
mysql> show databases;
Just for reference here are the databases and tables which should of been created after loading the contents of the MySQL 5.x SQL Script pack:
hibernate*
authorities
datasource
granted_authorities
users
quartz
qrtz_blob_triggers
qrtz_calendars
qrtz_cron_triggers
qrtz_fired_triggers
qrtz_job_details
qrtz_job_listeners
qrtz_locks
qrtz_paused_trigger_grps
qrtz_scheduler_state
qrtz_simple_triggers
qrtz_trigger_listeners
qrtz_triggers
sampledata
customer_w_ter
customers
department_managers
dim_time
employees
offices
orderdetails
orderfact
orders
payments
products
quadrant_actuals
trial_balance
*
Hibernate will create new tables into the hibernate
database after Pentaho BI Platform has started for the first time.
NOTE↴
If you have problems when running the 5_sample_data_mysql.sql scripts with timestamps you will need to replace all occurances of 00.000000000 with 00.000000 - this is an issue with MySQL timezones and timestamps.
This section describes how to configure the Pentaho BI Platform JDBC security to use a MySQL server, this means the Pentaho BI Platform will now point to the hibernate database on the MySQL server instead of the packaged HSQL database.
NOTE↴
If you already have a user which you prefer to have access to the hibernate database instead of the default user hibuser, you will need to modify all occurances of hibuser/password in this section.
applicationContext-spring-security-jdbc.xml
This
file is located under the pentaho-solutions\system\
folder.
Once the file has opened locate this snippet of
code:
<!--
This is only for Hypersonic. Please update this section for any other
database you are using -->
<bean
id="dataSource"
class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property
name="driverClassName" value="org.hsqldb.jdbcDriver"
/>
<property
name="url"
value="jdbc:hsqldb:hsql://localhost:9001/hibernate"
/>
<property name="username" value="hibuser"
/>
<property name="password" value="password"
/>
</bean>
Make
changes to the highlighted sections so that the section of code looks
similar to this:
<!-- This is only for Hypersonic. Please update this section for any other database you are using -->
<bean id="dataSource"
class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="com.mysql.jdbc.Driver" />
<property name="url"
value="jdbc:mysql://localhost:3306/hibernate" />
<property name="username" value="hibuser" />
<property name="password" value="password" />
</bean>
applicationContext-spring-security-hibernate.properties
This
file is located under the pentaho-solutions\system\
folder.
Once the file has opened locate this snippet of
code:
jdbc.driver=org.hsqldb.jdbcDriver
jdbc.url=jdbc:hsqldb:hsql://localhost:9001/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.HSQLDialect
Make changes to the highlighted sections so that the section of code looks similar to this:
jdbc.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.MySQLDialect
This
file is located under the pentaho-solutions\system\hibernate\
folder.
Once
the file has opened locate this snippet of code:
<config-file>system/hibernate/hsql.hibernate.cfg.xml</config-file>
Make changes to the highlighted section so that the section of code looks similar to this:
<config-file>system/hibernate/mysql5.hibernate.cfg.xml</config-file>
mysql5.hibernate.cfg.xml (optional)
This
file is located under the pentaho-solutions\system\hibernate\
folder.
You
do not need to make any changes to this file if you would like to use
the default user hibuser
(which was created with the 4_load_sample_users_mysql.sql
file). However, if you would like to specify your own user find and
change the following two lines of code:
<property name="connection.username">hibuser</property>
<property name="connection.password">password</property>
Make changes to the highlighted sections to a username and password of your choice.
Configuring Hibernate and Quartz
context.xml
Hibernate and Quartz need to specifically use the hibernate and quartz databases which were created on the MySQL server. To do so modifications need to be made to the context.xml file which is located in the \tomcat\webapps\pentaho\META-INF\ folder.
NOTE↴
If you already have a user which you prefer to have access the hibernate database instead of the default user hibuser, you will need to modify all occurances of hibuser/password in this section.This also applies to the pentaho_user/password used to connect to the Quartz database.
Once
the file has opened the following piece of code should be visible:
<?xml version="1.0" encoding="UTF-8"?>
<Context path="/pentaho" docbase="webapps/pentaho/">
<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="hibuser" password="password"
driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/hibernate"
validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES" />
<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="pentaho_user" password="password"
driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/quartz"
validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES"/>
</Context>
Make changes to the highlighted sections so that the section of code looks similar to this:
<?xml version="1.0" encoding="UTF-8"?>
<Context path="/pentaho" docbase="webapps/pentaho/">
<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="hibuser" password="password"
driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hibernate"
validationQuery="select 1" />
<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="pentaho_user" password="password"
driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/quartz"
validationQuery="select 1"/>
</Context>
quartz.properties
In version 3.8 of the BI Server you will now need to make an additional change to quartz.properties.
NOTE↴
If you already have a user which you prefer to have access the hibernate database instead of the default user hibuser, you will need to modify all occurances of hibuser/password in this section.This also applies to the pentaho_user/password used to connect to the Quartz database.
Once
the file has opened the following piece of code should be visible:
#_replace_jobstore_properties
org.quartz.jobStore.misfireThreshold
= 60000
org.quartz.jobStore.driverDelegateClass
=
org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.useProperties
= false
org.quartz.jobStore.dataSource
= myDS
org.quartz.jobStore.tablePrefix
= QRTZ_
org.quartz.jobStore.isClustered
= false
Make
changes to the highlighted sections so that the section of code looks
similar to this:
#_replace_jobstore_properties
org.quartz.jobStore.misfireThreshold
= 60000
org.quartz.jobStore.driverDelegateClass
=
org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.useProperties
= true
org.quartz.jobStore.dataSource
= quartz
org.quartz.jobStore.tablePrefix
= QRTZ_
org.quartz.jobStore.isClustered
= false
The second change will
Configuring Apache-Tomcat Server
To configure the settings of the Apache-Tomcat server for your Pentaho BI Platform most of the changes are done inside the web.xml file which is located under the \tomcat\webapps\pentaho\WEB_INF\ folder. You are able to configure the following items (and more) for the Pentaho BI Platform:
solution-path
pentaho-solutions location
URL
Disable HSQL database startup
TrustedIpAddrs (optional - for the administration console and if you are accessing the server remotely)
If you are happy with the following settings for your Pentaho BI Platform server you will not need to make any changes to this file:
pentaho-solutions\ folder located under the C:\pentaho\ folder
Visit http://localhost:8080/pentaho URL to launch the Pentaho BI Platform
The solution-path parameter lets the Pentaho BI Platform know where to locate the pentaho-solutions folder. By default this is set to c:\biserver-ce\ folder.
If you have decided to use an existing Apache-Tomcat server (or have moved your pentaho-solutions folder) you will need to point this to where you have placed your pentaho-solutions folder. In this example my pentaho-solutions folder is under the C:\pentaho\ folder, now my solution-path code snippet looks like this:
<context-param>
<param-name>solution-path</param-name>
<param-value>C:\pentaho\pentaho-solutions</param-value>
</context-param>
If you are happy with visiting the URL http://localhost:8080/pentaho to access Pentaho's BI Platform you will not need to change this parameter, however if you would like others to access the site (remotely or on a network) you will need to make changes to this parameter.
Open up the file and locate this line of code:
<param-value>http://localhost:8080/pentaho/</param-value>
Make changes to the highlighted section to your PC or server's domain or IP address so it looks similar to this:
<param-value>http://www.prashantraju.com:8080/pentaho/</param-value>
or
<param-value>http://192.168.1.10:8080/pentaho/</param-value>
By default with 3.7 HSQL database starts up automatically - to prevent this from happening locate the following snippets of code:
<context-param>
<param-name>hsqldb-databases</param-name>
<param-value>sampledata@../../data/hsqldb/sampledata,hibernate@../../data/hsqldb/hibernate,quartz@../../data/hsqldb/quartz</param-value>
</context-param>
<listener>
<listener-class>org.pentaho.platform.web.http.context.Hsqldb StartupListener</listener-class>
</listener>
You
can either remove the above snippets or comment it out, if you are
commenting it out it will look similar to this:
<!--
<context-param>
<param-name>hsqldb-databases</param-name>
<param-value>sampledata@../../data/hsqldb/sampledata,hibernate@../../data/hsqldb/hibernate,quartz@../../data/hsqldb/quartz</param-value>
</context-param>
<listener>
<listener-class>org.pentaho.platform.web.http.context.Hsqldb
StartupListener</listener-class>
</listener>
-->
If
you want to access your Tomcat-Apache server remotely - so in the
above step you have not specified localhost
or 127.0.0.1
for the base-url
parameter - you will need to add your Tomcat-Apache server's IP
address to this list.
Open up the file and locate this
line of code:
<param-name>TrustedIpAddrs</param-name>
<param-value>127.0.0.1</param-value>
Make changes to the highlighted section add your PC or server's domain or IP address so it looks similar to this:
<param-name>TrustedIpAddrs</param-name>
<param-value>127.0.0.1,[your_ip_address]</param-value>
This will allow the Pentaho Administration Console to 'ping' the server to see if it is up or down - you do not need to do this if you are hosting your server locally.
You can also change the local language and country under the web.xml file, the changes to these parameters are self explanatory.
Configuring SMTP (mail server)
To
configure the Pentaho BI Platform to use a SMTP server (mail server)
to use for emailing reports etc. you will need to make modifications
to the \pentaho-solutions\system\email_config.xml
file.
Here are the available parameters that can be
configured for SMTP support:
mail.smtp.host
This is the address of your SMTP email server for sending email e.g. smtp.gmail.com
mail.smtp.port
This is the port of your SMTP email server e.g. for GMail this is 587
mail.transport.protocol
The transport for accessing the email server. Usually this is smtp e.g. for GMail this is smtps
mail.smtp.starttls.enable
If you SMTP server uses TTLS authentication set this to true e.g. for GMail this is true
mail.smtp.auth
Set to true if the email server requires the sender to authenticate
mail.smtp.ssl
This is true if the email server requires an SSL connection e.g. for GMail this is true
mail.debug
Output debug information from the JavaMail API
mail.pop3
Not being used.
mail.from.default
The from address that emails from the Pentaho BI Platform e.g. prashantraju@gmail.com
mail.userid
The userid that is used when authenticating with the SMTP server, mail.smtp.auth must be set to true.
mail.password
The password that is used when authenticating with the SMTP server, mail.smtp.auth must be set to true.
Here
is an example of a smtp-email.xml
file configured for GMail:
<email-smtp>
<properties>
<mail.smtp.host>smtp.gmail.com</mail.smtp.host>
<mail.smtp.port>587</mail.smtp.port>
<mail.transport.protocol>smtps</mail.transport.protocol>
<mail.smtp.starttls.enable>true</mail.smtp.starttls.enable>
<mail.smtp.auth>true</mail.smtp.auth>
<mail.smtp.ssl>true</mail.smtp.ssl>
<mail.smtp.quitwait>false</mail.smtp.quitwait>
</properties>
<mail.pop3></mail.pop3>
<mail.from.default>pentahoadmin@gmail.com</mail.from.default>
<mail.userid>pentahoadmin@gmail.com</mail.userid>
<mail.password>password</mail.password>
</email-smtp>
By default publishing is not enabled, to enable it you will need to specify a password which will need to be used when publishing. To get started you will need to edit the publisher_config.xml file located under the pentaho-solutions\system\ folder, once open locate the following snippet of code:
<publisher-config>
<publisher-password></publisher-password>
</publisher-config>
Enter a password between the publisher-password tags (this password will be the same for all users) so the snippet of code looks similar to the example below (in this example the publisher password is publishthis):
<publisher-config>
<publisher-password>publishthis</publisher-password>
</publisher-config>
From now on when any user tries to publish content to Pentaho BI Platform they will need to specify this password.
Configuring the Administration Console
By default no configuration is needed when setting up Pentaho's Administration Console (PAC) with MySQL 5.x.
Starting the Business Intelligence Platform
The Pentaho BI Platform is a webapp on the Apache-Tomcat server. To start Apache-Tomcat you will need to setup Apache-Tomcat as a service which is a lot easier to start and stop (skip this step if you are using an existing installation of Apache-Tomcat). At the command prompt issue the following command (in bold):
C:\pentaho\biserver-ce\tomcat\bin> service.bat install tomcat5
Installing the service 'tomcat5' ...
Using CATALINA_HOME: D:\pentaho\biserver-ce\tomcat
Using CATALINA_BASE: D:\pentaho\biserver-ce\tomcat
Using JAVA_HOME: C:\Program Files\Java\jdk1.6.0_13
Using JVM: C:\Program Files\Java\jdk1.6.0_13\jre\bin\server\jvm.dll
The service 'tomcat5' has been installed.
Once you have received the above output the next step is to start the Tomcat service. To do this firstly click on the Start button then Run and type in services.msc and click OK. A Services window should appear and it will list all available services, locate the Apache Tomcat tomcat5 service and double click on it to open up the Properties dialog box:
To
start Tomcat click on the Start
button (to stop Tomcat simply click on the Stop
button).
Now you should be able to visit http://localhost:8080/pentaho or http://[your_domain_or_ip]:8080/pentaho. If the Pentaho BI Platform has started successfully you should see the following welcome screen:
After
logging in try and run a sample report from the Steel
Wheels
solution folder:
Starting the Administration Console
To start the Administration Console you will need to run the start-pac.bat file which is located under the c:\pentaho\adminstration-console\ folder after double clicking on the file a new command prompt window and should display something similar to the output below:
DEBUG: Using JAVA_HOME
DEBUG: _PENTAHO_JAVA_HOME=C:\Program Files\Java\jdk1.6.0_13
DEBUG: _PENTAHO_JAVA=C:\Program Files\Java\jdk1.6.0_13\bin\java.exe
2010-01-05 16:27:17.824::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
05/01/2010 4:27:18 PM org.pentaho.pac.server.JettyServer startServer
INFO: Console is starting
2010-01-05 16:27:18.118::INFO: jetty-6.1.2
2010-01-05 16:27:38.672::INFO: Started SocketConnector @ 0.0.0.0:8099
05/01/2010 4:27:38 PM org.pentaho.pac.server.JettyServer startServer
INFO: Console is now started. It can be accessed using http://D119940:8099 or http://161.117.117.40:8099
Now you should be able to visit http://localhost:8099/ or the other two address's specified in your output (highlighted above). You will be prompted for a Username and Password which by default are "admin" and "password". If you have successfully started and logged into the administration console you should see the following welcome screen: