I have the following SQL statement for querying users (teachers)

I have the following SQL statement for querying users (teachers) and excluding any records that have the same first name, surname and school name. The query runs really slowly and am looking to make it faster and more efficient.

SELECT ggteachers.TCHR_NO, ggteachers.REFNO, ggteachers.TCHR_TITLE,
       ggteachers.TCHR_FNAME, ggteachers.TCHR_SNAME, ggteachers.EMAIL,
       ggteachers.PASSWORD, ggteachers.regdate, ggteachers.areyoua,
       ggteachers.areyouaother, ggteachers.marketing, ggteachers.newsletter,
       ggteachers.lastlogin, ggschools.scl_name, ggschools.refno,
       ggschools.street, ggschools.town, ggschools.county, ggschools.postcode,
       ggschools.la_name, ggschools.country, ggschools.tel_std,
       ggschools.tel_num, ggschools.regtypeid, ggteachers.lastLogin,
       ggteachers.statusid, ggteachers.regtypeid, ggteachers.logincount
FROM ggteachers
JOIN ggschools ON (ggteachers.refno=ggschools.refno)
WHERE 1 = 1
  AND (ggschools.phase = 'Primary'
       OR ggschools.phase = 'Middle Deemed Primary'
       OR ggschools.phase = 'Not applicable'
       OR ggschools.phase = 'PRU'
       OR ggschools.phase = 'Special School'
       OR ggteachers.areyoua = 'Primary Teaching Assistant/Learning Mentor'
       OR ggteachers.areyoua = 'Primary Teacher'
       OR ggteachers.areyoua = 'Supply Teacher'
       OR ggteachers.areyoua = 'Student Teacher'
       OR ggteachers.areyoua = 'Foundation Stage Teacher')
  AND la_name = ?
  AND country = ?
GROUP BY tchr_sname, tchr_fname, ggschools.scl_name
ORDER BY regdate DESC
LIMIT 250 OFFSET 0

Hi. Thanks for your question.

I recommend the following changes (from easiest to hardest to implement):

1. Change your WHERE clause from this:

WHERE 1 = 1 AND (ggschools.phase = 'Primary' OR ggschools.phase = 'Middle Deemed Primary' OR ggschools.phase = 'Not applicable' OR ggschools.phase = 'PRU' OR ggschools.phase = 'Special School' OR ggteachers.areyoua = 'Primary Teaching Assistant/Learning Mentor' OR ggteachers.areyoua = 'Primary Teacher' OR ggteachers.areyoua = 'Supply Teacher' OR ggteachers.areyoua = 'Student Teacher' OR ggteachers.areyoua = 'Foundation Stage Teacher') AND la_name = ? AND country = ?

to this:

WHERE (ggschools.phase IN ('Primary', 'Middle Deemed Primary', 'Not applicable', 'PRU', 'Special School') OR ggteachers.areyoua IN ('Primary Teaching Assistant/Learning Mentor', 'Primary Teacher', 'Supply Teacher', 'Student Teacher', 'Foundation Stage Teacher')) AND la_name = ? AND country = ?

Do you really mean OR between the phase list and the areyoua list, or should it be AND?

2. Add non-unique indexes (ones that allow duplicate values) on ggschools.phase and ggteachers.areyoua if they don't already exist.

If you don't have access to the database, the DBA should be able to make this change in a few minutes.
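
As a rough sketch of step 2, assuming MySQL syntax and the table names used in the query (the index names are made up for illustration), you could list the existing indexes first and then add any that are missing:

```sql
-- List the indexes that already exist, so nothing is created twice.
SHOW INDEX FROM ggschools;
SHOW INDEX FROM ggteachers;

-- Non-unique indexes on the two filtered columns.
-- The index names below are hypothetical; run these only if
-- SHOW INDEX reports no index on the column.
CREATE INDEX idx_ggschools_phase ON ggschools (phase);
CREATE INDEX idx_ggteachers_areyoua ON ggteachers (areyoua);
```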

3. Normalize your database by making a table called Phases that assigns a unique numeric identifier to each phase type, and a table called TeacherType (aka Areyoua) that assigns a unique numeric identifier to each teacher type. Then, in the Schools and Teachers tables, respectively, store only the ID instead of the text of each type. You may want to do the same for the Country field.

Then you can query against integers instead of text fields, which is generally more efficient. Depending on your database size and complexity, this could take a fair amount of effort to implement, but it is likely to help with the performance problems (as well as improving your database design overall).

http://databases.about.com/od/specificproducts/a/2nf.htm
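
A minimal sketch of the Phases lookup table, assuming MySQL with a latin1 character set. The table name comes from the suggestion above; the column names phase_id and phase_name and the key names are hypothetical, and the same pattern would apply to TeacherType:

```sql
-- Lookup table: one row per distinct phase (hypothetical column names).
CREATE TABLE Phases (
    phase_id   INT NOT NULL AUTO_INCREMENT,
    phase_name VARCHAR(255) NOT NULL,
    PRIMARY KEY (phase_id),
    UNIQUE KEY uq_phase_name (phase_name)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- Seed it from the text values currently stored in the schools table.
INSERT INTO Phases (phase_name)
SELECT DISTINCT phase
FROM ggschools
WHERE phase IS NOT NULL;

-- Add an integer column to the schools table and index it.
ALTER TABLE ggschools
    ADD COLUMN phase_id INT NULL,
    ADD KEY idx_ggschools_phase_id (phase_id);

-- Fill the new column by matching on the existing text.
UPDATE ggschools s
JOIN Phases p ON p.phase_name = s.phase
SET s.phase_id = p.phase_id;
```

Once phase_id is populated and the application writes it, the filter on phase can compare integers (for example, phase_id IN (...)) and the text column can be retired.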

Kind regards,
Susan
Customer: replied 4 years ago.

Hi

Thanks for getting back to me. I've had a look at what I sent and I've sent the wrong snippet of SQL. This one actually has the code to remove any records that have the same first name, surname and school name. Can you see how I would make this more efficient?

Cheers

Shaun

SELECT ggteachers.TCHR_NO, ggteachers.REFNO, ggteachers.TCHR_TITLE,
       ggteachers.TCHR_FNAME, ggteachers.TCHR_SNAME, ggteachers.EMAIL,
       ggteachers.PASSWORD, ggteachers.regdate, ggteachers.areyoua,
       ggteachers.areyouaother, ggteachers.marketing, ggteachers.newsletter,
       ggteachers.lastlogin, ggschools.scl_name, ggschools.refno,
       ggschools.street, ggschools.town, ggschools.county, ggschools.postcode,
       ggschools.la_name, ggschools.country, ggschools.tel_std,
       ggschools.tel_num, ggschools.regtypeid, ggteachers.lastLogin,
       ggteachers.statusid, ggteachers.regtypeid, ggteachers.logincount
FROM ggteachers
JOIN ggschools ON (ggteachers.refno=ggschools.refno)
JOIN (
    SELECT tchr_no
    FROM (
        SELECT DISTINCT tchr_no, tchr_sname, tchr_fname, ggschools.scl_name,
               ggschools.postcode, ggteachers.refno,
               MIN(DATE_FORMAT(regdate, '%Y-%m-01')) AS regdate
        FROM ggteachers
        JOIN ggschools ON ggteachers.refno = ggschools.refno
        GROUP BY tchr_sname, tchr_fname, ggschools.scl_name, ggschools.postcode
    ) AS teachers
) AS unique_teachers ON (unique_teachers.tchr_no=ggteachers.tchr_no)
ORDER BY regdate DESC
LIMIT 250 OFFSET 0

Hi. Thanks for your question!

I'm not sure about this one. The normalization problems don't apply here, but it's a complex query. I'd probably add indexes on refno (in both the schools and teachers tables), on school name and postcode in the schools table, and on teacher number in the teachers table, if they don't exist.

Other than that, you could look at the plan your database system's query optimizer chooses for this statement. That will show how it orders the joins and handles the de-duplication, and where the time is likely going.
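
For example, assuming MySQL, you can prefix a SELECT with EXPLAIN to see the plan. The sketch below uses a simplified form of the grouped subquery from the question, not the full statement:

```sql
-- Show the plan for the de-duplicating part of the query.
-- Simplified for illustration: only the grouping columns are selected.
EXPLAIN
SELECT tchr_sname, tchr_fname, ggschools.scl_name, ggschools.postcode,
       MIN(tchr_no) AS tchr_no
FROM ggteachers
JOIN ggschools ON ggteachers.refno = ggschools.refno
GROUP BY tchr_sname, tchr_fname, ggschools.scl_name, ggschools.postcode;
```

In the output, a type of ALL indicates a full table scan, and "Using temporary; Using filesort" in the Extra column indicates that the grouping is done in a temporary table and then sorted.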

If the query is really unacceptably slow, you could also try a staging table. That is: first run a make-table query to set up a temporary table with the unique teacher information (grouped), and then run a secondary select query against that. You'd want to set it up as a stored procedure so that the two processes would always run together. That's not the most elegant solution, but it can certainly improve performance.
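
A minimal sketch of the staging-table idea, assuming MySQL. Keeping the lowest tchr_no in each group of duplicates is an assumption made here for illustration (the query in the question does not say which duplicate should survive), and VARCHAR(35) is an assumed size that should match the real tchr_no column:

```sql
-- Step 1: build a temporary table with one teacher number per
-- (surname, first name, school name, postcode) group.
DROP TEMPORARY TABLE IF EXISTS unique_teachers;

CREATE TEMPORARY TABLE unique_teachers (
    tchr_no VARCHAR(35) NOT NULL,   -- assumed type; match ggteachers.tchr_no
    PRIMARY KEY (tchr_no)
);

INSERT INTO unique_teachers (tchr_no)
SELECT MIN(t.tchr_no)               -- assumption: keep the lowest number
FROM ggteachers t
JOIN ggschools s ON t.refno = s.refno
GROUP BY t.tchr_sname, t.tchr_fname, s.scl_name, s.postcode;

-- Step 2: run the reporting query against the staged teacher numbers.
-- Only a few columns are shown; add the rest of the select list as needed.
SELECT t.tchr_no, t.tchr_fname, t.tchr_sname, t.regdate,
       s.scl_name, s.postcode
FROM unique_teachers u
JOIN ggteachers t ON t.tchr_no = u.tchr_no
JOIN ggschools s ON s.refno = t.refno
ORDER BY t.regdate DESC
LIMIT 250 OFFSET 0;
```

Both steps could then be wrapped in a stored procedure so that they always run together.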

Kind regards,
Susan

Customer: replied 4 years ago.
Relist: Answer quality.
I'm looking for someone to help rewrite the SQL query to be more efficient. The help offered was information that I already knew.

What DBMS are you using?

Are you wanting to remove the duplicates or only show the duplicated values once?

Can you post the DDL to create your tables?

How many records are in those tables? I need to determine whether it is an index issue or simply a problem with your query.

Thanks
Alex

Customer: replied 4 years ago.

Hi Alex

Thanks for getting in contact. It's MySQL 5.6. I want to remove all the duplicates from the query.

43,000 records

SET FOREIGN_KEY_CHECKS=0;

DROP TABLE IF EXISTS `schools`;
CREATE TABLE `schools` (
  `REFNO` int(11) NOT NULL AUTO_INCREMENT,
  `LA_NAME` varchar(255) DEFAULT NULL,
  `LA` int(11) DEFAULT NULL,
  `LA_NO` int(11) DEFAULT NULL,
  `areaid` int(11) DEFAULT '0',
  `DFES_NO` int(11) DEFAULT NULL,
  `SCL_NAME` varchar(255) DEFAULT NULL,
  `STREET` varchar(255) DEFAULT NULL,
  `LOCALITY` varchar(255) DEFAULT NULL,
  `ADDRESS_3` varchar(255) DEFAULT NULL,
  `TOWN` varchar(255) DEFAULT NULL,
  `COUNTY` varchar(255) DEFAULT NULL,
  `POSTCODE` varchar(12) DEFAULT NULL,
  `TEL_STD` varchar(11) DEFAULT NULL,
  `TEL_NUM` int(11) DEFAULT NULL,
  `PHASE` varchar(255) DEFAULT NULL,
  `COUNTRY` varchar(255) DEFAULT NULL,
  `countryid` int(11) DEFAULT '0',
  `regtypeid` int(1) DEFAULT '-1',
  `datecreated` datetime DEFAULT NULL,
  `datemodified` datetime DEFAULT NULL,
  PRIMARY KEY (`REFNO`),
  KEY `LA_NO` (`LA_NO`) USING BTREE,
  KEY `refno` (`REFNO`) USING BTREE,
  KEY `regtypeid` (`regtypeid`) USING BTREE,
  KEY `phase` (`PHASE`)
) ENGINE=InnoDB AUTO_INCREMENT=10024211 DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 23552 kB; InnoDB free: 13312 kB; InnoDB free: 2';

42,000 records

SET FOREIGN_KEY_CHECKS=0;

DROP TABLE IF EXISTS `teachers`;
CREATE TABLE `teachers` (
  `TCHR_NO` varchar(35) NOT NULL DEFAULT '0',
  `REFNO` int(11) DEFAULT NULL,
  `TCHR_TITLE` varchar(10) DEFAULT NULL,
  `TCHR_FNAME` varchar(20) DEFAULT NULL,
  `TCHR_SNAME` varchar(30) DEFAULT NULL,
  `EMAIL` varchar(100) DEFAULT NULL,
  `PASSWORD` varchar(255) DEFAULT NULL,
  `regdate` date DEFAULT NULL,
  `datemodified` datetime DEFAULT NULL,
  `areyoua` varchar(255) DEFAULT NULL,
  `areyouaother` varchar(255) DEFAULT NULL,
  `newsletter` int(1) NOT NULL DEFAULT '1',
  `accounttype` varchar(255) DEFAULT NULL,
  `marketing` int(1) NOT NULL DEFAULT '1',
  `regtypeid` int(1) DEFAULT NULL,
  `requpdate` int(11) DEFAULT '0',
  `logincount` int(11) DEFAULT '0',
  `lastlogin` datetime DEFAULT '0000-00-00 00:00:00',
  `statusid` int(11) NOT NULL DEFAULT '5',
  PRIMARY KEY (`TCHR_NO`),
  KEY `tchr_no` (`TCHR_NO`) USING BTREE,
  KEY `refno` (`REFNO`) USING BTREE,
  KEY `regtypeid` (`regtypeid`) USING BTREE,
  KEY `statusid` (`statusid`) USING BTREE,
  KEY `areyoua` (`areyoua`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='InnoDB free: 11264 kB; InnoDB free: 16384 kB; InnoDB free: 2';

Let me know if you need anything else.

Cheers

Shaun

Ok, I will look at this tonight.