Leela-TheProgrammer, Computer Software Engineer
Category: Programming
Satisfied Customers: 474
Experience:  Post Grad in CS (Gold Medal)

My client is giving us access to a large database in

Customer Question

My client is giving us access to a large database in Oracle and we have to audit the quality of the data. They have asked us to use Java so that, once we are done with the audit, their team can continue to audit as business rules change. I am looking for a Java library that would make it easier to do the basic testing, i.e.:
* Valid range of numbers, dates, etc. for each file.
* Does the unique key in the transaction file exist in the master table, etc.
* Name fields in multiple tables are the same.
* Business rules: check the sequence of transactions in the transaction file to make sure they conform to a limited list of valid sequences.

Submitted: 1 year ago.
Category: Programming
Expert:  Leela-TheProgrammer replied 1 year ago.

Hi,

Thanks for using JustAnswer.

This is Leela and I will help you with the question today.

Regards,

Leela

Expert:  Leela-TheProgrammer replied 1 year ago.

I understand that you want to analyze DATA from a large database against some business rules.

If you can provide more details on the VOLUME of data, such as the number of records and the number of tables, I can help you with details on the analysis.

Thanks,

Leela

Customer: replied 1 year ago.

About 1000 tables with 16000 fields and total volume is about 200 terabytes.

Expert:  Leela-TheProgrammer replied 1 year ago.

Hi,

Wow. That is really a huge amount of data. You would need to use Hadoop or another big data system in order to LOAD and ANALYZE the data.

Will you be doing a GENERAL analysis or looking for specific insights in the data?

Thanks,

Leela

Customer: replied 1 year ago.
I had the same thought. Now the real question is: do you know of any Java libraries that can give me a head start in verifying the data? I don't want to start from scratch. There have to be some Java libraries that at least do the basics.
The hard stuff, I would think, might have to be done by custom programming, yet who knows, there might be some Java libraries that would give me a head start.
Any ideas?
Expert:  Leela-TheProgrammer replied 1 year ago.

Hi,

I agree with your thoughts. Can you please elaborate a bit more on what you mean by VERIFYING data? Is it looking for missing column values in a row, or something else?

We have Java libraries to connect to databases, and we can easily use them to run SQL queries and find some insights, but that is going to take a lot of time and you would need HIGH-END servers to run the program.

When I said Hadoop, I meant MapReduce, whose jobs are typically written in Java; you can also use Spark to analyze the data stored in Hadoop.

First I want to help you with the Java library question. Please share more details on the data validation part.

If needed, we can have a quick call to discuss; please let me know.

Thanks,

Leela

Customer: replied 1 year ago.
To start with the basics: each field has a valid range (numbers and date ranges; in some cases lowest to highest, and in other cases the low and high relate to another date field in the record set). Then there is financial validation, where the transaction table that shows monies coming and going must be correctly reflected in the record set's total balance field.
There are also multiple tables that have the same field under a different name, and it should match the master record.
Looking for orphan records, where transaction tables don't have a master record.
And the big one (the hard one): there are several transactional tables that contain transaction codes. Some sequences are valid and others are not, and the final sequence should match the current status of the record set.
Any ideas?
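
For the orphan-record check described above, one possible starting point is to let the database do the work with an anti-join and only bring the count back to Java over plain JDBC. This is a minimal sketch, not a finished tool: the class name OrphanCheck, its method names, and the table and column names in the usage note are hypothetical, and it assumes a child row with a NULL key should not be counted as an orphan.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.regex.Pattern;

final class OrphanCheck {
    // Identifiers are concatenated into SQL, so only plain names are accepted.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_$#.]*");

    private OrphanCheck() {}

    private static String safe(String identifier) {
        if (identifier == null || !IDENTIFIER.matcher(identifier).matches()) {
            throw new IllegalArgumentException("Unsafe identifier: " + identifier);
        }
        return identifier;
    }

    /** Builds a query counting child rows whose key has no match in the master table. */
    static String orphanCountSql(String childTable, String childKey,
                                 String masterTable, String masterKey) {
        return "SELECT COUNT(*) FROM " + safe(childTable) + " c"
             + " WHERE c." + safe(childKey) + " IS NOT NULL"
             + " AND NOT EXISTS (SELECT 1 FROM " + safe(masterTable) + " m"
             + " WHERE m." + safe(masterKey) + " = c." + safe(childKey) + ")";
    }

    /** Runs the anti-join on an open connection and returns the number of orphan rows. */
    static long countOrphans(Connection conn, String childTable, String childKey,
                             String masterTable, String masterKey) throws SQLException {
        String sql = orphanCountSql(childTable, childKey, masterTable, masterKey);
        try (PreparedStatement ps = conn.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            rs.next(); // COUNT(*) always yields exactly one row
            return rs.getLong(1);
        }
    }
}
```

With hypothetical tables, OrphanCheck.countOrphans(conn, "TXN", "ACCOUNT_ID", "ACCOUNT", "ID") would report how many TXN rows point at a missing ACCOUNT row. Keeping the table and column names as parameters means the same method can be driven from a list of parent/child pairs as the rules change.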
Expert:  Leela-TheProgrammer replied 1 year ago.

Thanks for sharing the details. I will check and get back to you in some time.

Regards,

Leela

Expert:  Leela-TheProgrammer replied 1 year ago.

Hi,

I have analyzed your needs. As you are looking for a Java library, you can use Hibernate Validator to validate the data against the specific constraints you have.

http://hibernate.org/validator/
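
To make the kind of constraint concrete, here is a library-independent sketch of the range checks the customer listed: a numeric range, a date range, and a date that must not precede another date in the same record. With Hibernate Validator, single-field ranges like these are typically declared with Bean Validation annotations such as @Min and @Max, while a cross-field rule usually needs a class-level custom constraint. The class name RangeChecks, the field names, and the bounds are all hypothetical.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class RangeChecks {
    // Hypothetical bounds; real limits would come from the client's business rules.
    static final BigDecimal MIN_AMOUNT = BigDecimal.ZERO;
    static final BigDecimal MAX_AMOUNT = new BigDecimal("1000000");
    static final LocalDate EARLIEST_OPEN = LocalDate.of(1990, 1, 1);

    private RangeChecks() {}

    /** True when value is non-null and lies in [min, max], bounds inclusive. */
    static <T extends Comparable<? super T>> boolean inRange(T value, T min, T max) {
        return value != null && value.compareTo(min) >= 0 && value.compareTo(max) <= 0;
    }

    /** Returns the violations found in one record; an empty list means it passed. */
    static List<String> validate(BigDecimal amount, LocalDate openDate,
                                 LocalDate closeDate, LocalDate asOf) {
        List<String> violations = new ArrayList<>();
        if (!inRange(amount, MIN_AMOUNT, MAX_AMOUNT)) {
            violations.add("amount out of range: " + amount);
        }
        if (!inRange(openDate, EARLIEST_OPEN, asOf)) {
            violations.add("openDate out of range: " + openDate);
        }
        // Cross-field rule: the low bound of closeDate is another field of the record.
        if (openDate != null && closeDate != null && closeDate.isBefore(openDate)) {
            violations.add("closeDate precedes openDate: " + closeDate);
        }
        return violations;
    }
}
```

Returning a list of messages rather than a boolean lets one pass over a record report every failed rule, which tends to be more useful in an audit than stopping at the first failure.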

Please check and let me know if this helps.

Regards,

Leela

Customer: replied 1 year ago.

Hibernate Validator does look good for most of what I need to do.

However, the most complex of the tests involves looking at several transaction tables that have a transaction type. There are certain valid transaction series, and others whose meaning I am not sure of. How can I get a list of all valid transaction sequences, and also the ones that are not valid, with a software tool?

Expert:  Leela-TheProgrammer replied 1 year ago.

Hi,

Glad you see value in Hibernate; it is a great offering for Java users to interact with databases.

With regard to your question about VALID TRANSACTIONS: a transaction is just a ROW in a database table, and whether a row is VALID or not is defined by BUSINESS RULES, not by GENERAL logic. So I am not sure there can be any tool that automatically determines whether a row is VALID or not.

Mostly, you can use Hibernate Validator, define your business rules, and validate the data against them.

Hope this helps.

Thanks,

Leela

Customer: replied 1 year ago.
Each row by itself is meaningless; it is a set of rows, of varying length, over a specific number of days that is either valid or not. Someone suggested looking for a lineage tool, but I am not sure I understand how that could help.
Expert:  Leela-TheProgrammer replied 1 year ago.

Hi,

By business logic I meant the way you define a VALID transaction. As per my understanding there is no automatic way; it would require some programming, or at least some definition of the business rules.

To help further, we can use additional services to discuss more over a phone call, as you may not want to provide sensitive details in chat. Please let me know so that we can proceed accordingly.

Thanks,

Leela
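
As an illustration of the "some programming" mentioned above, the sketch below lists every distinct transaction-code sequence found in the data, with a count, and then separates out the sequences that are not on a list of known-valid ones. It is only a sketch under assumptions: the class SequenceAudit, the Txn fields, and the ">" separator are hypothetical, each row is assumed to carry a record id and an ordering value, and the codes are assumed not to contain ">". It works on rows already loaded into memory, so at the volume discussed in this thread the same grouping would have to run inside the database or a cluster job.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class SequenceAudit {
    /** One transaction row: owning record id, ordering position, transaction code. */
    static final class Txn {
        final String recordId;
        final long position;
        final String code;

        Txn(String recordId, long position, String code) {
            this.recordId = recordId;
            this.position = position;
            this.code = code;
        }
    }

    private SequenceAudit() {}

    /** Orders rows per record and counts how often each distinct code sequence occurs. */
    static Map<String, Integer> sequenceCounts(List<Txn> txns) {
        List<Txn> sorted = new ArrayList<>(txns);
        sorted.sort(Comparator.comparing((Txn t) -> t.recordId)
                              .thenComparingLong(t -> t.position));
        Map<String, List<String>> codesByRecord = new LinkedHashMap<>();
        for (Txn t : sorted) {
            codesByRecord.computeIfAbsent(t.recordId, k -> new ArrayList<>()).add(t.code);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> codes : codesByRecord.values()) {
            counts.merge(String.join(">", codes), 1, Integer::sum);
        }
        return counts;
    }

    /** Returns the observed sequences that are not in the set of known-valid ones. */
    static Map<String, Integer> unknownSequences(Map<String, Integer> counts,
                                                 Set<String> validSequences) {
        Map<String, Integer> unknown = new TreeMap<>(counts);
        unknown.keySet().removeAll(validSequences);
        return unknown;
    }
}
```

The output of sequenceCounts can be reviewed with the business to decide which sequences belong on the valid list; unknownSequences then flags everything else. Checking that the last code of each sequence agrees with the record's current status is not covered here and would need the status column as an additional input.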
