Author Archives: Shanoj
Setting the storage driver in Docker
Reference:
https://docs.docker.com/storage/storagedriver/select-storage-driver/
https://docs.docker.com/storage/storagedriver/
| Linux distribution | Recommended storage drivers | Alternative drivers |
|---|---|---|
| Docker Engine – Community on Ubuntu | overlay2 or aufs (for Ubuntu 14.04 running on kernel 3.13) | overlay¹, devicemapper², zfs, vfs |
| Docker Engine – Community on Debian | overlay2 (Debian Stretch), aufs or devicemapper (older versions) | overlay¹, vfs |
| Docker Engine – Community on CentOS | overlay2 | overlay¹, devicemapper², zfs, vfs |
| Docker Engine – Community on Fedora | overlay2 | overlay¹, devicemapper², zfs, vfs |
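As a rough sketch of how the table's recommendations could be encoded in a provisioning script (the distro IDs as found in `/etc/os-release` and the `vfs` fallback are illustrative assumptions, not part of the Docker docs):

```shell
# Sketch: map a distro ID to the recommended storage driver from the table above.
# Distro IDs and the fallback are assumptions for illustration.
recommended_driver() {
  case "$1" in
    ubuntu|debian|centos|fedora) echo overlay2 ;;
    *) echo vfs ;;  # vfs works everywhere but is slow; safe fallback only
  esac
}
recommended_driver centos
```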
Get the current storage driver:
docker info

Set the storage driver explicitly using the daemon configuration file. This is the method that Docker recommends.
sudo vi /etc/docker/daemon.json
Add the storage driver details to the daemon configuration file:
{
"storage-driver": "devicemapper"
}
Restart Docker after editing the file.
sudo systemctl restart docker
sudo systemctl status docker
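A cautious way to apply the change is to validate the JSON before it reaches /etc/docker/daemon.json, since a malformed file prevents the daemon from starting. A minimal sketch (writing to a temp file first is my addition, not a Docker requirement):

```shell
# Sketch: build daemon.json in a temp file and check it is well-formed JSON
# before moving it into place and restarting Docker.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{
  "storage-driver": "devicemapper"
}
EOF
# A malformed daemon.json stops dockerd from starting, so validate first.
python3 -m json.tool "$tmp" > /dev/null && echo "daemon.json OK"
# On a real host, then:
# sudo cp "$tmp" /etc/docker/daemon.json && sudo systemctl restart docker
```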
Installing Docker on CentOS

Install the required packages; these are prerequisites for the Docker installation on CentOS:
sudo yum install -y device-mapper-persistent-data lvm2
Add the Docker CE repo:
sudo yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
Install the Docker CE packages and containerd.io:
sudo yum install -y docker-ce-18.09.5 docker-ce-cli-18.09.5 containerd.io
Start and enable the Docker service:
sudo systemctl start docker
sudo systemctl enable docker
Add test_user to the docker group, giving the user permission to run docker commands:
sudo usermod -a -G docker test_user
Log out and log back in, then test the installation by running a simple container:
docker run hello-world
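To sanity-check the group change without running a container, you can inspect the docker line of /etc/group. The helper below is a hypothetical illustration that parses a sample group line, so it runs anywhere:

```shell
# Sketch: check whether a user appears in an /etc/group-style line.
# group_has_user LINE USER -> exit 0 if USER is a member (hypothetical helper).
group_has_user() {
  line=$1; user=$2
  members=${line##*:}          # member list is the last colon-separated field
  case ",$members," in
    *",$user,"*) return 0 ;;
    *) return 1 ;;
  esac
}
# Sample line; on a real host you would use: getent group docker
line="docker:x:999:test_user,alice"
group_has_user "$line" test_user && echo "test_user is in the docker group"
```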
I am very excited: my book is out for sale.
Looking for Oracle RAC administration jobs?
This book provides complete coverage of Oracle RAC administration interview questions and answers. It helps you crack your interview and acquire your dream career as an Oracle RAC administrator, and it is a perfect companion for standing out in today's competitive job market.
Sections to be discussed:
Basic to advanced RAC administration interview questions
RAC installation questions
RAC upgrade/patching questions
RAC Data Guard configuration questions
RAC troubleshooting questions
390 Oracle RAC administration interview questions for getting hired as an Oracle Database RAC administrator.
Using Kafka Connect to Capture Data from a Relational Database (sqlite3)
Use any Kafka Docker image to install and start Kafka.
Reference:
https://docs.confluent.io/current/connect/userguide.html
https://github.com/bitnami/bitnami-docker-kafka
https://docs.confluent.io/3.1.1/connect/connect-jdbc/docs/sink_connector.html
JDBC driver download for SQLite3:
https://bitbucket.org/xerial/sqlite-jdbc/downloads/
- Start Kafka.
confluent start
- Install SQLite3.
apt-get update && apt-get install -y sqlite3
- Create a New Database and Populate It with a Table and Some Data
Create a new database called “test.db”.
root@shanoj_srv1:/# sqlite3 test.db
- Create a new table in the SQLite database called “accounts”.
CREATE TABLE accounts (id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, name VARCHAR (255));
- Insert values into the table to begin populating it.
INSERT INTO accounts(name) VALUES('sabu');
INSERT INTO accounts(name) VALUES('ronnie');
.quit
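The same setup can also be scripted non-interactively, which is handy for repeating the demo. A sketch, assuming the sqlite3 CLI installed in the step above (the temp-file path is illustrative):

```shell
# Sketch: create the demo database and the accounts table in one shot
# (assumes the sqlite3 CLI; the temp path stands in for test.db).
db=$(mktemp -u).db
sqlite3 "$db" <<'SQL'
CREATE TABLE accounts (id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, name VARCHAR(255));
INSERT INTO accounts(name) VALUES('sabu');
INSERT INTO accounts(name) VALUES('ronnie');
SQL
# The AUTOINCREMENT column fills in ids 1 and 2 automatically.
sqlite3 "$db" "SELECT id, name FROM accounts;"
```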
- Stop Kafka Connect.
confluent stop connect
- Make the necessary changes to the files below:
root@shanoj_srv1:/# vi /etc/schema-registry/connect-avro-standalone.properties

bootstrap.servers=localhost:9092
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
# The internal converter used for offsets and config data is configurable and must be specified.
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
# Local storage file for offset data
offset.storage.file.filename=/tmp/connect.offsets
root@shanoj_srv1:/# vi /etc/kafka-connect-jdbc/source-quickstart-sqlite.properties

# A simple example that copies all tables from a SQLite database. The first few settings are
# required for all connectors: a name, the connector class to run, and the maximum number of
# tasks to create:
name=test-source-sqlite-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
# The remaining configs are specific to the JDBC source connector. In this example, we connect to a
# SQLite database stored in the file test.db, use an auto-incrementing column called 'id' to
# detect new rows as they are added, and output to topics prefixed with 'test-sqlite-jdbc-', e.g.
# a table called 'users' will be written to the topic 'test-sqlite-jdbc-users'.
connection.url=jdbc:sqlite:test.db
mode=incrementing
incrementing.column.name=id
topic.prefix=test-sqlite-jdbc-
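The connector derives topic names by concatenating topic.prefix with the table name, which is why the accounts table shows up later as test-sqlite-jdbc-accounts. A tiny sketch of that mapping ('users' is a hypothetical second table):

```shell
# Sketch: how the JDBC source connector names topics (prefix + table name).
prefix="test-sqlite-jdbc-"
for table in accounts users; do
  printf '%s%s\n' "$prefix" "$table"
done
```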
- Start Kafka Connect in standalone mode.
root@shanoj_srv1:/# connect-standalone -daemon /etc/schema-registry/connect-avro-standalone.properties /etc/kafka-connect-jdbc/source-quickstart-sqlite.properties
- Verify that the connector was created.
root@shanoj_srv1:/# cat /logs/connectStandalone.out | grep -i "finished"
[2019-08-15 15:45:49,421] INFO Finished creating connector test-source-sqlite-jdbc-autoincrement (org.apache.kafka.connect.runtime.Worker:225)
[2019-08-15 15:45:49,504] INFO Source task WorkerSourceTask{id=test-source-sqlite-jdbc-autoincrement-0} finished initialization and start (org.apache.kafka.connect.runtime.WorkerSourceTask:143)
[2019-08-15 15:46:49,484] INFO Finished WorkerSourceTask{id=test-source-sqlite-jdbc-autoincrement-0} commitOffsets successfully in 6 ms (org.apache.kafka.connect.runtime.WorkerSourceTask:373)
root@shanoj_srv1:/# curl -s localhost:8083/connectors
- Examine the Kafka topic created.
root@shanoj_srv1:/# kafka-topics --list --zookeeper localhost:2181 | grep test-sqlite-jdbc
test-sqlite-jdbc-accounts
Start a Kafka Consumer and Write New Data to the Database
- Open a Kafka consumer.
root@shanoj_srv1:/# kafka-avro-console-consumer --new-consumer --bootstrap-server localhost:9092 --topic test-sqlite-jdbc-accounts --from-beginning
- Open a new terminal session and start a shell in the container.
root@shanoj_srv1:/# sudo docker exec -it sqlite-test /bin/bash
- Change to the /tmp directory.
root@shanoj_srv1:/# cd /tmp
- Access the SQLite database test.db and insert new values into the accounts table.
root@shanoj_srv1:/tmp# sqlite3 test.db
SQLite version 3.8.7.1 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> INSERT INTO accounts(name) VALUES('rama');
sqlite> INSERT INTO accounts(name) VALUES('lev');
sqlite> INSERT INTO accounts(name) VALUES('sriram');
sqlite> INSERT INTO accounts(name) VALUES('joby');
sqlite> INSERT INTO accounts(name) VALUES('shanoj');
sqlite>
- Return to the previous session with the consumer and verify the data has been written.
root@ip-10-0-1-100:/# kafka-avro-console-consumer --new-consumer --bootstrap-server localhost:9092 --topic test-sqlite-jdbc-accounts --from-beginning
{"id":3,"name":{"string":"rama"}}
{"id":4,"name":{"string":"lev"}}
{"id":5,"name":{"string":"sriram"}}
{"id":6,"name":{"string":"joby"}}
{"id":7,"name":{"string":"shanoj"}}
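If you want to pull just the inserted names back out of that Avro console output, a quick sed over the JSON lines works. A sketch using the records above as a sample string:

```shell
# Sketch: extract the "name" values from the consumer's JSON output
# (a few of the records above, pasted as a sample string).
records='{"id":3,"name":{"string":"rama"}}
{"id":4,"name":{"string":"lev"}}
{"id":7,"name":{"string":"shanoj"}}'
printf '%s\n' "$records" | sed -n 's/.*"string":"\([^"]*\)".*/\1/p'
```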
Install and Configure PostgreSQL 9.x: RHEL/CentOS
1. Download and install the PostgreSQL repository RPM using the appropriate package manager:
~ $ rpm -Uvh https://yum.postgresql.org/9.4/redhat/rhel-7-x86_64/pgdg-centos94-9.4-3.noarch.rpm
Retrieving https://yum.postgresql.org/9.4/redhat/rhel-7-x86_64/pgdg-centos94-9.4-3.noarch.rpm
warning: /var/tmp/rpm-tmp.IZow7N: Header V4 DSA/SHA1 Signature, key ID 442df0f8: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:pgdg-redhat-repo-42.0-4          ################################# [100%]
2. Apply any necessary updates:
[root@tcox6 ~]# yum update
3. Install the PostgreSQL 9.4 server and the associated contrib modules and utilities. Once installed, run the database initialization routine before starting the database.
[root@tcox6 ~]# yum install postgresql94-server postgresql94-contrib
[root@tcox6 ~]# /usr/pgsql-9.4/bin/postgresql94-setup initdb
4. Enable the PostgreSQL 9.4 server to run on system start and then start the database server.
[root@tcox6 ~]# systemctl enable postgresql-9.4
ln -s '/usr/lib/systemd/system/postgresql-9.4.service' '/etc/systemd/system/multi-user.target.wants/postgresql-9.4.service'
[root@tcox6 ~]# systemctl start postgresql-9.4
5. Check whether SELinux is running in enforcing mode on your system. If so, run the command to allow external HTTP DB connections to the server through the SELinux configuration.
# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=enforcing
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
# setsebool -P httpd_can_network_connect_db 1
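The enforcing check in step 5 can be scripted. A sketch that reads the SELINUX mode from config text (a sample string here, rather than the live /etc/selinux/config):

```shell
# Sketch: extract the SELINUX mode from /etc/selinux/config-style content.
# Using a sample string so the logic can be shown on any machine.
conf='SELINUX=enforcing
SELINUXTYPE=targeted'
mode=$(printf '%s\n' "$conf" | sed -n 's/^SELINUX=//p')
echo "$mode"
# Only when the mode is "enforcing" is the boolean change needed:
# setsebool -P httpd_can_network_connect_db 1
```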
6. Log in as the 'postgres' user and run the 'psql' command. Once at the database prompt, set a password for the 'postgres' database user.
[root@tcox6 ~]# su - postgres
Last login: Wed Sep 2 13:35:21 UTC 2015 on pts/0
-bash-4.2$ psql
psql (9.4.4)
Type "help" for help.
postgres=# \password postgres
Enter new password:
Enter it again:
postgres=# quit
postgres-# \q
-bash-4.2$ exit
logout
Oracle Exadata Interview Questions and Answers:
1) What are the advantages of Exadata?
The Exadata cluster allows for consistent performance while allowing for increased throughput. As load increases on the cluster, performance remains consistent by utilizing inter-instance and intra-instance parallelism. It should not be expected that simply moving to Exadata will improve performance, although in most cases it will, especially if the current database host is overloaded.

2) What is the secret behind Exadata's higher throughput?
Exadata ships less data through the pipes between the storage nodes and the database nodes in the RAC cluster. Its ability to run parallel processes across all the nodes in the cluster also provides a much higher level of throughput. In addition, it has much bigger pipes, using the InfiniBand interconnect for inter-instance data block transfers at bandwidths as high as 5x that of Fibre Channel networks.

3) What are the key hardware components?
- DB servers
- Storage Server cells
- High-speed InfiniBand switch
- Cisco switch
4) What are the key software features?
- Smart Scan
- Smart Flash Cache
- Storage Index
- Exadata Hybrid Columnar Compression (EHCC)
- IORM (I/O Resource Manager)

5) What is a Cell Disk and a Grid Disk?
Cell Disks and Grid Disks are logical components of the physical Exadata storage. A cell, or Exadata Storage Server cell, is a combination of disk drives put together to store user data. Each Cell Disk corresponds to a LUN (Logical Unit) which has been formatted by the Exadata Storage Server software; typically, each cell has 12 disk drives mapped to it. Grid Disks are created on top of Cell Disks and are presented to Oracle ASM as ASM disks. Space is allocated in chunks from the outer tracks of the Cell Disk moving inwards. One can have multiple Grid Disks per Cell Disk.
6) What is IORM?
IORM stands for I/O Resource Manager. It manages I/O demand based on the configuration and the amount of resources available, and it ensures that none of the cells become oversubscribed with I/O requests. This is achieved by managing the incoming requests at the consumer-group level. Using IORM, you can divide the I/O bandwidth between multiple databases. To implement IORM, resource groups, consumers, and plans need to be created first.

7) What is Hybrid Columnar Compression?
Hybrid Columnar Compression, also called HCC, is a feature of Exadata used for compressing data at the column level for a table. It creates compression units which consist of logical groupings of column values, typically spanning several data blocks; each data block holds column data for multiple rows. This algorithm can reduce the storage used by the data and reduce disk I/O, enhancing query performance. The different types of HCC compression are:
- Query Low
- Query High
- Archive Low
- Archive High

8) What is Flash Cache?
Four 96 GB PCIe flash memory cards are present on each Exadata Storage Server cell, providing very fast access to the data stored on them. This reduces data access latency by retrieving data from flash memory rather than having to read it from disk. A total of 384 GB of flash storage per cell is available on the Exadata appliance.

9) What is Smart Scan?
It is a feature of the Exadata software which enhances database performance many times over. It processes queries in an intelligent way, retrieving specific rows rather than complete blocks. It applies filtering criteria at the storage level based on the selection criteria specified in the query. It also performs column projection, which is the process of sending only the columns required by the query back to the database host/instance.

10) What are the parallelism instance parameters used in Exadata?
The parameter PARALLEL_FORCE_LOCAL can be specified at the session level for a particular job.

11) How do you test the performance of Exadata?
You can use the "calibrate" command at the CellCLI command line.

12) What are the ways to migrate onto Exadata?
Depending on the downtime allowed, there are several options:
- Oracle Data Guard
- Traditional export/import
- Tablespace transportation
- GoldenGate replication after a data restore onto Exadata

13) What types of operations does Exadata "offload"?
Some of the operations that are offloaded from the database host to the cell servers are:
- Predicate filtering
- Column projection filtering
- Join processing
- Backups

14) What is CellCLI?
It is the command-line utility used to manage the cell storage.

15) How do you obtain info on the Cell Disks?
At the CellCLI command line, issue the "list celldisk" command.

16) How would you create a Grid Disk?
At the CellCLI command line, issue the "create griddisk all ..." command.

17) What are the cellinit.ora and cellip.ora files used for?
These files hold the hostnames and IP addresses of all the nodes in the cluster. They are used to run commands on remote database and cell server nodes from a local host.

18) Which package can be used to estimate the compression ratio of a table?
DBMS_COMPRESSION.

19) What are the background services of the cell server?
- MS: Management Server
- cellsrv: Cell Server
- RS: Restart Server

20) How many disks come within a storage cell?
12.

21) What is the purpose of the spine switch?
The spine switch is used to connect or add more Exadata machines in the cluster.

22) How do you migrate a database from a normal setup to Exadata?
There are many methods we can use to migrate a DB to Exadata. Below are some of them:
1. Export/import
2. Physical standby
3. Logical standby
4. Transportable tablespace
5. Transportable database
6. GoldenGate
7. RMAN cold and hot backup restoration
8. Oracle Streams

23) Can we use a flash disk as an ASM disk?
Yes.

24) Which protocol is used for communication between the database server and the storage server?
The iDB protocol.

25) Which OS is supported on Exadata?
Database servers have two OS options, Linux or Solaris, which can be finalized at configuration time. Cell storage comes with Linux only.

26) What is ASR?
ASR (Auto Service Request) is the tool used to manage the Oracle hardware. Whenever a hardware fault occurs, ASR automatically raises an SR with Oracle Support and sends a notification to the customer.

27) How do you upgrade the firmware of Exadata components?
It can be done through the ILOM of the DB or cell server.

28) Where do we define which cell storage can be used by a particular database server?
The cellip.ora file contains the list of storage servers which are accessed by the DB server.

29) What are the Exadata health check tools?
1. Exachk
2. sundiag
3. OSWatcher
4. OEM 12c

30) What is EHCC?
EHCC is Exadata Hybrid Columnar Compression, which is used to compress data in the database.

31) What is offloading and how does it work?
It refers to the fact that part of the traditional SQL processing done by the database can be "offloaded" from the database layer to the storage layer. The primary benefit of offloading is the reduction in the volume of data that must be returned to the database server, which is one of the major bottlenecks of most large databases.

32) What is the difference between CellCLI and DCLI?
CellCLI can be used on the respective cell storage only. DCLI (Distributed Command Line Utility) can replicate a command on multiple storage servers as well as DB servers.

33) What is IORM and what is its role in Exadata?
IORM stands for I/O Resource Manager, which manages the I/O of multiple databases on the storage cells.

34) How can we check whether Oracle best practices have been configured on Exadata?
We can execute Exachk and verify the best-practice setup on the Exadata machine.

35) How many networks are required in Exadata?
1. Public/client network: for application connectivity
2. Management network: for Exadata H/W management
3. Private network: for cluster interconnect and storage connectivity

36) What is the command to enable query high compression on a table?
SQL> alter table table_name move compress for query high;

37) How do you take a backup of the cell storage software?
It is not required to take a backup, as it happens automatically. Exadata uses an internal USB drive called the CELLBOOT USB flash drive to back up the software.

38) What is the difference between write-through and write-back flash cache modes?
1. writethrough: the flash cache is used only for reads
2. writeback: the flash cache is used for both reads and writes

39) Which feature of Exadata is used to eliminate disk I/O?
Flash Cache.

40) What is the capacity of an InfiniBand port?
40 Gbps.

41) What is the difference between high capacity and high performance disks?
1. High capacity disks come with more storage space and lower rpm (7.5k)
2. High performance disks come with less storage and higher rpm (15k)

42) When should one execute Exachk?
Before and after any configuration change in the Database Machine.

43) What is a Grid Disk?
Grid Disks are created on top of Cell Disks and are presented to Oracle ASM as ASM disks. Space is allocated in chunks from the outer tracks of the Cell Disk moving inwards. One can have multiple Grid Disks per Cell Disk.

44) Which network is used for RAC interconnectivity?
The InfiniBand network.

45) What is Smart Scan?
It is a feature of the Exadata software which enhances database performance many times over. It processes queries in an intelligent way, retrieving specific rows rather than complete blocks. It applies filtering criteria at the storage level based on the selection criteria specified in the query. It also performs column projection, which is the process of sending only the columns required by the query back to the database host/instance.

46) What are the parallelism instance parameters used in Exadata?
The parameter PARALLEL_FORCE_LOCAL can be specified at the session level for a particular job.

47) Which statistic can be used to check the flash cache hit ratio at the database level?
"cell flash cache read hits"

48) Which disk group is used to keep the OCR files on Exadata?
+DBFS_DG

49) How many Exadata wait events are contained in the 11.2.0.3 release?
There are 53 Exadata-specific wait events.

50) What is the difference between DBRM and IORM?
DBRM is a feature of the database, while IORM is a feature of the storage server software.

51) Which ASM parameters are responsible for auto disk management in Exadata?
- _AUTO_MANAGE_MAX_ONLINE_TRIES: controls the maximum number of attempts to bring a disk online
- _AUTO_MANAGE_EXADATA_DISKS: controls the auto disk management feature
- _AUTO_MANAGE_NUM_TRIES: controls the maximum number of attempts to perform an automatic operation

52) How do you enable flash cache compression?
CellCLI> ALTER CELL flashCacheCompress=true

53) How many Exadata Storage Server nodes are included in an Exadata Database Machine X4-8?
14 storage nodes.

54) What is the client or public network in Exadata?
The client or public network is used to establish connectivity between the database and the application.

55) What are the steps involved in the initial Exadata configuration?
- Initial network preparation
- Configure Exadata servers
- Configure Exadata software
- Configure database hosts to use Exadata
- Configure ASM and database instances
- Configure ASM disk groups for Exadata

56) What is the iDB protocol?
iDB stands for intelligent database protocol. It is a network-based protocol which is responsible for communication between the storage cells and the database servers.

57) What is LIBCELL?
Libcell stands for Library Cell, which is linked with the Oracle kernel. It allows the Oracle kernel to talk to the storage server over the network instead of performing operating system reads and writes.

58) Which package is used by the compression advisor utility?
The DBMS_COMPRESSION package.

59) What is the primary goal of the storage index?
Storage indexes are a feature unique to the Exadata Database Machine whose primary goal is to reduce the amount of I/O required to service I/O requests for Exadata Smart Scan.

60) What is Smart Scan offloading?
Offloading and Smart Scan are two terms that are used somewhat interchangeably. Exadata Smart Scan offloads the processing of queries from the database server to the storage server. Processors on the Exadata Storage Server process the data on behalf of the database SQL query, and only the data requested in the query is returned to the database server.

61) What is checkip and what is it used for?
checkip is an OS-level script which contains the IP addresses and hostnames that will be used by Exadata in the configuration phase. It checks network readiness, such as proper DNS configuration, and also checks that there is no IP duplication in the network by pinging addresses which are not supposed to respond yet.

62) Which script is used to reclaim the disk space of unused operating systems?
For Linux: reclaimdisks.sh. For Solaris: reclaimdisks.pl.

63) How does the database server communicate with the storage cells?
The database server communicates with the storage cells through the InfiniBand network.

64) Can I have multiple Cell Disks for one Grid Disk?
No. A Cell Disk can have multiple Grid Disks, but a Grid Disk cannot span multiple Cell Disks.

65) How many FMods are available on each flash card?
Four FMods (Flash Modules) are available on each flash card.

66) What is the smart flash log?
The smart flash log is a temporary storage area on the Exadata smart flash cache used to store redo log data.

67) Which parameter is used to enable and disable Smart Scan?
cell_offload_processing

68) How do you check the InfiniBand topology?
We can verify the InfiniBand switch topology by executing the verify-topology script from one of the database servers.

69) Can we use HCC in a non-Exadata environment?
No, HCC is only available for data stored on an Exadata storage server.

70) What is a resource plan?
It is a collection of plan directives that determines how database resources are to be allocated.

71) What is DBFS?
DBFS stands for Database File System, which can be built on an ASM disk group using a database tablespace.

72) What is the purpose of the InfiniBand spine switch?
The spine switch is used to connect multiple Exadata database machines.

73) What is offload block filtering?
The Exadata storage server filters out the blocks that are not required for the incremental backup in progress, so only the blocks that are required for the backup are sent to the database.

74) Which protocol is used by ASR to send notifications?
SNMP.

75) Is manual intervention possible in the storage index?
No.

76) What are the options for CELL_FLASH_CACHE on an object?
KEEP, DEFAULT, NONE.

77) What is the default size of the smart flash log?
512 MB per module. Each storage cell has 4 modules, so it is 4 x 512 MB per cell.

78) What is the flash cache and how does it work?
The flash cache is a hardware component configured in the Exadata storage cell server which delivers high performance in read and write operations. The primary task of the smart flash cache is to hold frequently accessed data, so the next time the same data is required, a physical read can be avoided by reading the data from the flash cache.
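The per-cell flash figures quoted in the answers above (four 96 GB flash cards, and four 512 MB smart flash log modules) as simple arithmetic:

```shell
# Sketch: per-cell flash sizes from the Q&A above.
flash_cache_gb=$((4 * 96))    # four 96 GB PCIe flash cards
flash_log_mb=$((4 * 512))     # four 512 MB smart flash log modules
echo "flash cache per cell: ${flash_cache_gb} GB"
echo "smart flash log per cell: ${flash_log_mb} MB"
```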

