SAP NLS with Hadoop
Solutions for data growth
To overcome the challenges of data growth, SAP provides a wide array of data management techniques, such as SAP Near-Line Storage and archiving, to address the complex needs of your business. These needs often include regulatory compliance and the cost of maintaining and upgrading the IT environment.
Archive SAP BW data to Hadoop using the NLS solution
SAP's data management techniques include unloading data with Near-Line Storage (NLS) technology. Earlier, SAP offered archiving to SAP IQ; it has since added newer options, such as a temperature-based concept for offloading data. Hadoop can be chosen as the new storage for warm and cold data. It balances low storage costs, easy extensibility and fast performance.
What is Near-Line Storage?
Near-Line storage is an intermediate type of storage that represents a compromise between online storage and offline storage/archiving. Online storage is immediately available for I/O. Offline storage, on the other hand, is not immediately available and requires human intervention to bring it online. This is where Near-Line storage gives you an advantage: it is not immediately available, but it can be brought online without any human intervention.
For a rapidly growing BW system, including BW on HANA, it is important to understand the concept of NLS. Whether your system is on premise or in the cloud, you can save a considerable amount with clever data management and NLS. Hadoop can be treated as a “side car” to the BW database, which lowers the total cost of ownership of BW data storage. Moving less frequently used data to NLS is a great way to save BW database space. Some customers run their Hadoop cluster on premise themselves and are satisfied with this solution. Hadoop is also supported by major cloud providers such as Azure and AWS.
SAP BW related features for NLS
SAP BW running on a SAP HANA database:
- Direct reading of NLS data via analytic queries is fully supported.
- You can use SAP HANA Smart Data Access (SDA) to access the data if you use the SAP Spark SQL Adapter.
SAP BW running on another database:
- Only archiving of data is supported.
- Query access and reading are not supported.
By using this solution, you can retrieve the data whenever you need it without compromising your IT infrastructure.
Near-Line storage with Hadoop and SAP HANA
Most companies using SAP experience huge data growth as they expand. Keeping transactional data online, particularly after it is closed, is expensive, impractical and risky. The main challenge here is not storage but data management. Swiftly growing data and document volumes cause system performance and productivity to drop quickly, which in turn leads to an unsatisfying user experience and burdens IT with higher maintenance costs.
Near-Line storage is mainly a temperature-based data management strategy. Not all data in SAP HANA is accessed frequently, yet it all has to be held in memory, which increases the amount of main memory required. Less frequently used data, i.e. warm and cold data, is archived in Hadoop using the NLS solution, which is efficient and cost effective. Archiving data in separate storage reduces main memory usage, frees up the hardware resources involved in heavy processing and still keeps the static data available for use. In Hadoop, HDFS is well suited to storing cold data, whereas Hive is well suited to storing warm data. The user can retrieve this archived data with queries whenever it is required. Besides providing a storage mechanism, Hadoop also helps with real-time data loading, parallel processing of complex data and discovering unknown relationships in the data.
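The temperature idea above can be illustrated with a small Python sketch. It is not SAP code: the `Partition` class, the age thresholds and the partition names are hypothetical and chosen only for illustration. The mapping of warm data to Hive and cold data to HDFS follows the paragraph above.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical thresholds (assumptions, not SAP defaults).
WARM_AFTER_DAYS = 365        # not accessed for a year   -> warm
COLD_AFTER_DAYS = 3 * 365    # not accessed for 3 years  -> cold

# Storage target per temperature, as described in the text.
TARGET = {"hot": "SAP HANA (in memory)", "warm": "Hive", "cold": "HDFS"}


@dataclass
class Partition:
    """Hypothetical stand-in for a slice of BW data."""
    name: str
    last_accessed: date


def temperature(p: Partition, today: date) -> str:
    """Classify a partition by the number of days since its last access."""
    age = (today - p.last_accessed).days
    if age >= COLD_AFTER_DAYS:
        return "cold"
    if age >= WARM_AFTER_DAYS:
        return "warm"
    return "hot"


def plan(partitions, today):
    """Map each partition name to the storage tier it would move to."""
    return {p.name: TARGET[temperature(p, today)] for p in partitions}


if __name__ == "__main__":
    today = date(2024, 1, 1)
    parts = [
        Partition("SALES_2023", date(2023, 12, 1)),
        Partition("SALES_2021", date(2022, 6, 1)),
        Partition("SALES_2018", date(2019, 1, 15)),
    ]
    for name, target in plan(parts, today).items():
        print(f"{name}: {target}")
```

In a real BW system the classification would normally be driven by time characteristics in the data archiving process rather than by a last-access date; the sketch only shows how a temperature rule separates hot, warm and cold data.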
Benefits with Hadoop
- It is easily extensible thanks to its distributed computing
- It is fast in reporting
- It scales to large volumes of data
- It is flexible with respect to programming languages
Besides archiving, Hadoop can also be used for other big data scenarios in parallel. Major cloud providers such as Azure and AWS support this solution.
Integrating SAP HANA with Hadoop requires the SAP HANA Spark Controller (recommended) to be installed, together with the Hive ODBC driver, on the Hadoop server. Before that, prepare the Hadoop server by creating a dedicated database and defining the required services. Next, prepare HANA by creating the remote source for SAP BW, and set up the SAP BW system with the respective RFCs. You need at least SAP_BW 7.50 Support Package 4 and the following on Hadoop and HANA:
- Core Hadoop version 2.7.1 (HDFS, MapReduce2, YARN).
- WebHDFS is recommended for REST-based HDFS access; HttpFS claims to be interoperable but is untested.
- Tez 0.7.0 as execution engine for Hive (might be replaced by plain MapReduce2)
- Hive 1.2.1
- Spark 1.5.2 (required for SAP HANA Spark Controller or DPAgent via Spark Thrift Server)
- Knox Gateway 0.6.0 (optional)
- HANA 1.0 Support Package 12 (05/2016)
- SAP HANA Spark Controller 2.0 SP 01 Patch 1
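As a quick pre-flight check, the Hadoop-side versions in the list above can be compared against what a cluster reports. The Python sketch below treats the listed versions as lower bounds, which is an assumption (the list may be meant as exact tested versions). The `installed` dictionary is hypothetical input that you would fill in from your own cluster. Tez and Knox are left out because the list marks them as replaceable or optional.

```python
# Versions copied from the list above; treated as minimums (assumption).
MINIMUM = {
    "hadoop": "2.7.1",
    "hive": "1.2.1",
    "spark": "1.5.2",
}


def parse(version: str) -> tuple:
    """Turn '2.7.1' into (2, 7, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def unmet(installed: dict) -> list:
    """Return one message per component that is missing or too old."""
    problems = []
    for component, minimum in MINIMUM.items():
        found = installed.get(component)
        if found is None:
            problems.append(f"{component}: not found (need >= {minimum})")
        elif parse(found) < parse(minimum):
            problems.append(f"{component}: {found} < {minimum}")
    return problems


if __name__ == "__main__":
    # Hypothetical cluster inventory.
    installed = {"hadoop": "2.10.0", "hive": "1.2.0"}
    for line in unmet(installed) or ["all listed components satisfied"]:
        print(line)
```

Comparing parsed tuples rather than raw strings matters here: as strings, "2.10.0" would sort before "2.7.1".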
Refer to the links below for detailed steps on integrating HANA with Hadoop. Once both systems are installed successfully, you can start archiving data according to the temperature strategy.