Provider Attribution

This appears to be the most expensive and difficult query that we currently do. I (GS) am curious to understand the problem.

I don't even quite know what it is that needs to be done, but I have the code that attempts to do it, and some people have narrated the specification to me. So I start writing down what I know of the specifications. Then I think how to do it in principle based on those specifications. And thirdly I analyze the code that exists presently.

Specification

The system shall list for each patient a single primary care physician who appears to be the primary care provider for that patient.

The system shall take into consideration evidence of Patient Physician interactions. Evidence includes, but is not limited to:

  • Encounters
  • Orders
  • Claims

The system shall weigh that evidence so as to select the most likely primary care physician for the patient.

The system shall be able to laterally suppress evidence based on other evidence. For example, the presence of an inpatient encounter would suppress evidence of other data at the same time. [I am not quite clear why]

The system shall only select a physician who is identified as a "primary care physician" and not a hospitalist.

Sketchy Design

My conceptual model is of course the HL7 v3 RIM. In the RIM all the patient provider interactions are just Acts.

Patient - participation - Act - participation - Provider

Now if we start with just this (I know there are many additional things going on), the right approach is the following:

  1. with interaction as select patient, provider from join patient - act - provider
  2. with interaction_tally as select patient, provider, count(1) from interaction group by patient, provider
  3. with interaction_max as select patient, max(count) from interaction_tally group by patient
  4. with interaction_winner as select patient, provider from interaction_max inner join interaction_tally using(patient) where count = max

By the time I have explained this in plain English, I have written the SQL:

WITH Interaction AS (
  SELECT pat.id AS patientId, doc.id AS providerId 
    FROM Act act
      INNER JOIN Participation pat ON(pat.act_id = act.id)
      INNER JOIN Participation doc ON(doc.act_id = act.id)
), InteractionTally AS (
  SELECT patientId, providerId, count(1) AS count
    FROM Interaction
    GROUP BY patientId, providerId
), InteractionMax AS (
  SELECT patientId, max(count) as count 
    FROM InteractionTally
    GROUP BY patient
), InteractionWinner AS (
  SELECT patientId, providerId
    FROM InteractionMax 
      INNER JOIN InteractionTally using(patientId, count)
)

This is the skeleton. Everything else seems just extra sugar, like:

  • constraints and weighing
    • constrain the Participation.types (subject, patient, performer, author, etc.)
    • constrain the Act.code (encounter, substance administration order, etc.)
    • constrain provider role (requires additional join(s))
    • add an interaction weighing scheme
    • add time based weighing
  • include the join through Patient global id
  • add lateral inhibition of interactions (if that is really needed)

I think you'd like to drive the constraints as well as the weighing using one approach, because they are really the same thing. If you say, "an inpatient encounter is not counted", you are saying "the weight of the inpatient encounter is zero". The only reason for actually excluding things is in order to cut down on the size of the join.

Let's put some weighing / exclusion thing in using a Weighing parameter table. That table lists all the combinations of the relevant attributes and ascribes a weight to each of them. (It is a minor implementation detail whether you do that with one table or with the cross product of multiple weighing facet tables.)

WITH Interaction AS (
  SELECT pat.id AS patientId, doc.id AS providerId, w.value AS weight
    FROM Act act
      INNER JOIN Participation pat ON(pat.act_id = act.id)
      INNER JOIN Participation doc ON(doc.act_id = act.id)





      INNER JOIN Weighing w ON(
        actClassCode = act.classCode, 
        actMoodCode = act.moodCode,
        actCode = act.code,
        patPartTypeCode = pat.typeCode
        docPartTypeCode = doc.typeCode
        ...
      )
)

now once we have the weight pulled in, the rest is easy, all we change is turn count(1) count into sum(weight) weight:

WITH Interaction AS (
  SELECT pat.id AS patientId, doc.id AS providerId, w.value AS weight
    ...
), InteractionTally AS (
  SELECT patientId, providerId, sum(weight) AS weight
    FROM Interaction
    WHERE weight > weightCutoff
    GROUP BY patientId, providerId
), InteractionMax AS (
  SELECT patientId, max(weight) as weight
    FROM InteractionTally
    GROUP BY patient
), InteractionWinner AS (
  SELECT patientId, providerId
    FROM InteractionMax 
      INNER JOIN InteractionTally using(patientId, weight)
)

You can think of the Weighing table as something that can even be calibrated.

Time based weighing would be added by a function which reduces the weight over time. This can be a square-shaped function that is 1 for a certain window of time before now and 0 thereafter. Or it can be a monotonously decreasing function from 1 down to 0. The weighing table can set that, and

WITH Interaction AS (
  SELECT pat.id AS patientId, doc.id AS providerId, w.value * apply(w.decayFunction, now() - act.time) AS weight
    ...
)

so that's fine.

What is hard is the lateral suppression of interactions, i.e., when we start introducing dependency in the weighing algorithm. But then, the nice thing about it is that it is also just another way of doing the weights. And of course we know that adding up weights has a perfect probabilistic interpretation and really is the same idea as Fellegi Sunter. So if we want we can even estimate all that weight stuff using the EM method. But we don't have to, we can assign the weights heuristically.

The patient global index issue is just another pair of joins. No big deal in principle.

Oh, and of course the RMRS is not HL7 v3, so the whole Act and participation stuff will be a bit more complicated. But then it is not really introducing anything fundamentally new, all we do is map the various special classes and timestamps onto Acts like this:

WITH EncounterAct AS (
  SELECT 'ENCOUNTER' AS classCode, 'EVN' AS moodCode, encounter_id AS id, encounterTime AS time
    FROM Encounter
), OrderAct AS (
  SELECT 'ACT' AS classCode, moodCode, medical_order_id AS id, orderTime AS time
), Act AS (
  SELECT * FROM EncounterAct
  UNION ALL
  SELECT * FROM OrderAct
  ...
)

As a conclusion, unless I don't have all the requirements, I do have here a straight-forward approach to solving the problem in principle. Now (and only now!) we need to look into optimization issues.

Optimizing the Approach

I think it's too early to optimize because we haven't even run a test yet. But to plan a little bit what we might run up against we can do some calculations. The main Interaction table will be pretty big because we will include all Acts, that is, order of magnitude 109 items. Then we need to do a group by on that to have about 108 items in the interaction tally. Then that needs to pick the winner 107 sort and join again to the 108. Well, if that's what we have to do, then that's what we have to do. I don't see how just randomly tweaking things will change that.

Is there a principle which we can use to optimize the approach? I can't tell at this time.

Current Code

Enough from-scratch thought about this problem. Time to look at what we have as the present solution. I wish we could see the initial design. The very first design that this whole intricate code started out with.

We don't have that, so what I will do now is go through the current design commenting freely as I'm trying to make sense of it. I keep the complete unchanged code at the end. And I will copy fragments as I'm analyzing it.

Round 1: Outline

The code is a Oracle module with 5 PL/SQL procedures. So first I just put the procedures here with arguments and description comment, no code.

   PROCEDURE RUN_SELECT_PATIENTS(
                activity_years  IN INT,
                report_year     IN INT,
                report_month    IN INT
            ) ...

Patient search function to perform cohort selection before RUN_PCP_SEARCH (previously part of RUN_PCP_SEARCH).

Parameters:

  • activity_years - number of years old activity can be for qualifying patients in cohort. For example, 5 would indicate that anyone with activity in the last 5 years would be included in the patient cohort.
  • report_year - target year for reporting PCP activity, full Gregorian year (e.g., 2007)
  • report_month - target month for reporting PCP activity, number between 1 and 12 (Jan and Dec).

Lines of code: ~100

    PROCEDURE RUN_PCP_SEARCH(
                activity_years  IN INT,
                report_year     IN INT,
                report_month    IN INT
            ) ...

Main search function to perform entire PCP data gathering and calculations Run RUN_SELECT_PATIENTS first, same parameters as that.

Lines of code: ~300

Given that the remaining procedures are not documented and have quite specific names, I will assume that those are helper methods:

    PROCEDURE INPATIENT_LOOKUP(
                report_year    IN INT,
                report_month   IN INT
            ) ...

Used in RUN_PCP_SEARCH.

    PROCEDURE PHYSICIAN_LOOKUP(
                report_year    IN INT,
                report_month   IN INT
            ) ...

Used in RUN_PCP_SEARCH.

Now this last one is special, it's called multiple times also from RUN_PCP_SEARCH:

    PROCEDURE RUN_ACTIVITY_LOAD(
                activity_years        IN INT,
                report_year           IN INT,
                report_month          IN INT,
                target_institution_id IN INT
            ) ...

Loads data from PARTICIPATION into PCP_ACTIVITY.

Parameters:

  • activity_years - same as above
  • report_year - same as above
  • report_month - same as above
  • target_institution_id - institution to copy

Intriguing: PCP_ACTIVITY sounds a lot like "Act". Let's see what it is. Here is some info:

INSERT INTO PCP_ACTIVITY(
  institution_id,
  patient_id,
  participant_provider_id,
  participant_number,
  participant_number_sys_id,
  participation_time,
  usable_yn,
  act_type_code

well what is that?

  • patientId: composite of course (institution_id, patient_id)
  • providerId: (participant_provider_id) not sure if institution_id also factors in
  • (participant_number, participant_number_sys_id), some external id?
  • Act.time: (participation_time)
  • Act.code: (act_type_code)
  • Some filtering: usable_yn

O.k. so that means that a lot of work is spent on WRITING Act records from the database just because the database design didn't like the idea of HL7 v3 RIM Acts? What if we could avoid creating all those Act records and instead just queried existing ones? Just a thought.

Round 2: Digging In

We want to dig into the 2 main methods. They are long spaghetti procedures, a joy to look at. But then we can just show the skeleton of each. My main interest is the one that sets up the Act table, and that is the RUN_PCP_SEARCH procedure.

Let's just outline it to see the basic SQL statements, CURSORs and loops:

There are tons of things like that:

DBMS_OUTPUT.PUT_LINE('... some log message ...');
INSERT INTO preavis.out(msg, insert_time) 
  VALUES ('... same log message (typed again) ...'
          sysDate );
COMMIT;

that does not make the code-gourmet happy. Typing the same string twice for a log message is really a nice way of increasing redundancy, enhancing the possibility of inconsistencies, and it sure a lot of fun for maintaining the code.

But what is more concerning is that we COMMIT here. So whenever there is a log output we COMMIT. Whatever.

The first real step says in comment:

Populate intial physician activity log for patient-physician associations

  DELETE FROM PCP_ACTIVITY;
  COMMIT;

  PCP.RUN_ACTIVITY_LOAD( activity_years, report_year, report_month, 1 );
  PCP.RUN_ACTIVITY_LOAD( activity_years, report_year, report_month, 2 );
  PCP.RUN_ACTIVITY_LOAD( activity_years, report_year, report_month, 3 );
  ...

and on and on going through a bunch of institutions. That's simple enough to understand (although I have my eyebrows raised.)

The next step says:

Look for a match to a DEFining identifier for a LOCAL_PROVIDER

and goes:

  UPDATE PCP_ACTIVITY pal
     SET pal.participant_provider_id = (
       SELECT pi.person_id
         FROM person_identifier pi
        WHERE pal.participant_number = pi.identifier 
          AND pal.participant_number_sys_id = pi.identifier_sys_id
          AND pi.identifier_assignment_code = 'DEF'
     ) WHERE pal.participant_provider_id IS NULL
         AND pal.participant_global_id IS NULL
         AND pal.usable_yn = 'Y'
         AND pal.participant_number IS NOT NULL
         AND pal.participant_number_sys_id IS NOT NULL;
  COMMIT;

Hmm, an UPDATE is going to cause loads of logging. Reads through everything and assigns the original institution's identifier to that provider_id. Why could this not have been a simple join done when the Act table was first created? And why are we not using the Person table which I thought contained provider ids? Why is this special? Why don't we do the same Person_identifier correction for Patient?

This might make sense only if the usable_yn flag is 'N' most of the time so that we might save a lot of work. But then that could have been part of a join too. Questions, questions.

Next:

Now we look for a match to a REFining identifier for a LOCAL_PROVIDER. Note that we match to an arbitrary MIN(local_provider_id) in this case.

What, "refining"? What is being "refined" here? I thought DEF stands for DEFining and REF for referring? Am I off base or are we so deep in jargon that we forgot to express thought plainly?

   UPDATE PCP_ACTIVITY pal
      SET pal.participant_provider_id = (
        SELECT MIN(pi.person_id)
          FROM person_identifier pi
         WHERE pal.participant_number = pi.identifier
           AND pal.participant_number_sys_id = pi.identifier_sys_id
           AND pi.identifier_assignment_code = 'REF'
        ) WHERE pal.participant_provider_id IS NULL
            AND pal.participant_global_id IS NULL
            AND pal.usable_yn = 'Y'
            AND pal.participant_number IS NOT NULL
            AND pal.participant_number_sys_id IS NOT NULL;
   COMMIT;

This is almost the same thing as we just saw. Why the duplication? Aha, I assume it is to give precedence to DEFining identifiers and only second precedence to REFerring ones. But is it worth reading over 109 Acts twice just to fill in some missing identifiers?

Next:

Search for global_person_id for the physicians in the activity log

  PCP.PHYSICIAN_LOOKUP( report_year, report_month );
  COMMIT;

Oh, modular code. I see.

Next:

Mark the PARTICIPATIONS that occured during an INPATIENT visit.

   PCP.INPATIENT_LOOKUP( report_year, report_month );
   COMMIT;

more modular code.

Next:

Mark the PARTICIPATIONS between excluded doctor/patient combinations.

  UPDATE PCP_ACTIVITY a
     SET usable_yn = 'N',
         not_usable_reason = 'pcp_exclusion'
    WHERE EXISTS (
      SELECT 1
        FROM pcp_exclusion e
          INNER JOIN Person p ON(e.patient_global_id = p.global_person_id)
        WHERE a.participation_time <= e.effective_date
          AND a.participant_global_id=e.physician_global_id
          AND a.institution_id=p.institution_id
          AND a.patient_id = p.person_id
      )
      AND usable_yn='Y';
  COMMIT;

Hmm, so we really could not do this beforehand either? Every time we run this UPDATE we have to process all those Acts.

Next:

Calculate PCP for each patient, based on activity. First, we calculate PCP based on just last 18 months of activity. Next, we calculate PCP (for anyone not assinged a PCP in the 18 month pass) based on the entire 3 years worth of activity in PCP_ACTIVITY.

Looks like now it's getting hot, could it be that this is our very result table?

  INSERT INTO PCP_PROVIDER_SUMMARY ( 
        evaluation_year,
        evaluation_month,
        global_person_id,
        participant_global_id,
        participation_count,
        participation_day_count,
        participation_time_latest,
        participation_time_earliest,
        provider_freq_ran
      )
     SELECT report_year, 
            report_month,
            ppl.global_person_id,
            pal.participant_global_id,
            count(pal.participation_time) as participation_count,
            count(distinct pal.participation_time) as participation_day_count,
            max(pal.participation_time) as participation_time_latest,
            min(pal.participation_time) as participation_time_earliest,
            RANK() OVER (
              PARTITION BY ppl.global_person_id
                  ORDER BY count(distinct pal.participation_time) DESC,
                           max(participation_time) DESC
                           min(participation_time),
                           count(pal.participation_time) DESC
            ) as rank
       FROM PCP_ACTIVITY pal
         INNER JOIN pcp_patient ppl 
           ON(    ppl.institution_id = pal.institution_id 
              AND ppl.patient_id = pal.patient_id
              AND ppl.evaluation_year = report_year
              AND ppl.evaluation_month = report_month)
      WHERE pal.participant_global_id IS NOT NULL 
        AND usable_yn='Y'
        AND pal.participation_time >= (RPT_DATE-((365*activity_years)/2))
        AND ppl.origin_code = 'PATIENT'
      GROUP BY ppl.global_person_id, pal.participant_global_id;
  COMMIT;

This feels a lot like that ranking and weighing going on here. And there is a second INSERT statement just like the first immediately after. I assume this could have been an INSERT ... SELECT ... UNION ALL SELECT ... so now

     SELECT report_year,
            report_month,
            ppl.global_person_id,
            pal.participant_global_id,
            count(pal.participation_time) as participation_count,
            count(distinct pal.participation_time) as participation_day_count,
            max(pal.participation_time) as participation_time_latest,
            min(pal.participation_time) as participation_time_earliest,
            RANK() OVER (
              PARTITION BY ppl.global_person_id
                  ORDER BY count(distinct pal.participation_time) DESC,
                           max(participation_time) DESC,
                           min(participation_time),
                           count(pal.participation_time) DESC
              ) as rank
       FROM PCP_ACTIVITY pal
         INNER JOIN pcp_patient ppl
           ON(    ppl.institution_id = pal.institution_id
              AND ppl.patient_id = pal.patient_id
              AND ppl.evaluation_year = report_year
              AND ppl.evaluation_month = report_month)
      WHERE pal.participant_global_id IS NOT NULL
        AND pal.usable_yn='Y'
        AND pal.participation_time >= (RPT_DATE-((365*activity_years)))
        AND (    pal.participation_time >= (RPT_DATE-((30))) 
             AND pal.participation_time < RPT_DATE)
        AND ppl.global_person_id NOT IN (
          SELECT distinct(pps.global_person_id)
            FROM pcp_provider_summary pps
           WHERE pps.evaluation_year = report_year
             AND pps.evaluation_month = report_month
          )
        AND ppl.origin_code='PATIENT'
      GROUP BY ppl.global_person_id,pal.participant_global_id;
  COMMIT;

But we're not done, this one now is the last join of the max to the tally:

Find most commonly occuring physician for each patient

  UPDATE PCP_PROVIDER_SUMMARY pps
     SET pps.pcp_assigned_yn='Y'
     WHERE provider_freq_rank=1
       AND evaluation_year=report_year
       AND evaluation_month=report_month;
  COMMIT;

Hmm, o.k. so that was the gist of it.

I am concluding for today with 2 questions:

  • Has this approach been chosen after or before trying a more straight-forward logical design?
  • Where are we currently spending most of our resources? Is there any analysis of this as to where most of the CPU and IO is spent?
  • Can we see a log file with that output so we can perhaps guess the answers? The logfile would also help with the cardinalities to understand how much data we are shuffling around while creating all these intermediary tables and updating it over and over again.

Procedure: PHYSICIAN_LOOKUP

According to John Hook, this procedure is one of the more costly ones.

This procedure is called once from RUN_PCP_SEARCH main procedure.

To understand that we need to wade through a huge amount of variable declarations:

PROCEDURE PHYSICIAN_LOOKUP(
       report_year    IN INT,
       report_month    IN INT
     ) IS

  UPD_CNT  NUMBER :=    0;
  I        NUMBER :=    0;
  LKP_CNT  NUMBER :=    0;

  TARGET_PROVIDER_ID LOCAL_PROVIDER.local_provider_id%TYPE := 0;
  PREV_PROVIDER_ID   LOCAL_PROVIDER.local_provider_id%TYPE := 0; -- DEAD

  PCP_MIN_FLAG       LOCAL_PROVIDER.pcp_yn%TYPE; 
  PCP_MAX_FLAG       LOCAL_PROVIDER.pcp_yn%TYPE;
  INST_ID            LOCAL_PROVIDER.institution_id%TYPE    := 0; -- DEAD

  PART_TIME          PCP_ACTIVITY.participation_time%TYPE; -- DEAD
  GLOB_ID            PERSON.global_person_id%TYPE          := 0;
  PREV_GLOB_ID       PERSON.global_person_id%TYPE          := 0; -- DEAD

  TYPE prov_id_list IS TABLE OF
     PCP_ACTIVITY.participant_provider_id%TYPE INDEX BY BINARY_INTEGER;

  PROV_IDS           prov_id_list;
  PROV_IDS_CNT       NUMBER := 0;

  RPT_DATE DATE :=
     add_months(to_date(concat(concat(concat(to_char(report_year), '/'),
           to_char(report_month)), '/01'), 'yyyy/mm/dd'), 1);

Then comes a bit of comment and we begin seeing some SQL:

This CURSOR is used to read the PCP_ACTIVITY rows where we have identified a provider/doctor but have not yet connected it to the global physician occurences or checked the validity of the doctor for being a PCP.

  CURSOR pcp_activity_read IS
    SELECT participant_provider_id,participation_time
      FROM pcp_activity
     WHERE participant_global_id IS NULL
       AND participant_provider_id IS NOT NULL
       AND usable_yn='Y'
     ORDER BY participant_provider_id,participation_time
     FOR UPDATE;

BEGIN

The strange thing about the cursor named "pcp_activity_read" is that this is never referenced anywhere. Unless I am completely off base about PL/SQL, this is a piece of dead code. So I won't bother understanding it.

Now the real stuff starts. There are a lot of "SELECT ... INTO ..." calls that stuff data into arrays.

  SELECT pa.participant_provider_id
    BULK COLLECT INTO PROV_IDS
    FROM pcp_activity pa
   WHERE pa.participant_provider_id IS NOT NULL
     AND pa.usable_yn='Y'
   GROUP BY pa.participant_provider_id;

    PROV_IDS_CNT := PROV_IDS.COUNT;

and be reminded that:

  TYPE prov_id_list IS TABLE OF
     PCP_ACTIVITY.participant_provider_id%TYPE INDEX BY BINARY_INTEGER;
  PROV_IDS     prov_id_list;
  PROV_IDS_CNT NUMBER := 0;

So now this is going through all the interaction table to find the providerIds. Why? Aha:

    LKP_CNT := 0;
    UPD_CNT := 0;

    FOR I IN 1..PROV_IDS_CNT
    LOOP 
      TARGET_PROVIDER_ID := PROV_IDS(I);
    BEGIN

so we are now looping over each such provider id. Let's see what we're trying to do with them:

      SELECT institution_id
        INTO INST_ID
        FROM local_provider
       WHERE local_provider_id = TARGET_PROVIDER_ID;

hmm, isn't LOCAL_PROVIDER_ID the primary key of LOCAL_PROVIDER? If so, why have we not just slapped this as a JOIN when we first built that Interaction table? Also, take note that this INST_ID is used only once, far down and away from here. No idea what the logic is of finding this once here and then using it miles away from here. Minor issue. [Added later: if it wasn't for the fact that this is just dead code for a useless INSERT at the end]

Next:

      SELECT p1.global_person_ID, 
             MIN(lp1.pcp_yn),
             MAX(lp1.pcp_yn)
        INTO GLOB_ID, PCP_MIN_FLAG, PCP_MAX_FLAG
        FROM LOCAL_PROVIDER lp -- WHERE lp.local_provider_id = TARGET_PROVIDER_ID
          INNER JOIN PERSON per
            ON(    per.institution_id = lp.institution_id
               AND per.person_id = lp.local_provider_id)
          INNER JOIN PERSON p1
            ON(p1.global_person_id = per.global_person_id)
          LEFT OUTER JOIN LOCAL_PROVIDER lp1
            ON(    lp1.institution_id = p1.institution_id
               AND lp1.local_provider_id = p1.person_id)
          WHERE lp.local_provider_id = TARGET_PROVIDER_ID
            AND (   lp1.retirement_time IS NULL
                 OR lp1.retirement_time >= (RPT_DATE-365))
          GROUP BY p1.global_person_id;

hmm, this now gets all the provider records for this global person id, and shows what all those records think if the provider is a primary care provider or not. But this query is so peculiarly complex! 3 joins, twice to Person and a left outer join to LOCAL_PROVIDER again.

It looks like we want to look at all the pairs of

Provider><Person [globalPersonId] Person><Provider

that regard the same global person id. But the first is related to the local provider id which is the current item in the big FOR loop.

Why are we doing this? Because it looks like we simply want all the records of this person and see if any of them claims the provider as a primary care provider.

Next it comments:

ELSE try to read the global provider id value

  EXCEPTION WHEN NO_DATA_FOUND THEN GLOB_ID := 0; END;

Now that's funny, in the last query there was a LEFT OUTER JOIN and I wondered why that was. The only reason is that we always want something back. But GLOB_ID is NULL if the LEFT OUTER JOIN has nothing to join. So that means the query above could have just been an inner join. Also note that GLOB_ID = 0 surely is trying to use a normal domain value as a NULL flag.

The story evolves thus:

  LKP_CNT := LKP_CNT + 1;

  IF GLOB_ID = 0  -- IF no global id (should not happen)
  THEN 
    UPDATE pcp_activity
       SET usable_yn='N', -- mark activity as unusable in PCP calc
           not_usable_reason='id'
     WHERE participant_provider_id = TARGET_PROVIDER_ID
       AND participant_global_id IS NULL;

what?? We already have a participant_global_id? There seems to be a very simple piece of reasoning all burried in a bunch of nifty codes which depends on intricate understanding of PL/SQL.

Next it says:

ELSE IF physician cannot be a PCP

  ELSE IF    PCP_MAX_FLAG='N' 
          OR (    PCP_MAX_FLAG='U'
              AND PCP_MIN_FLAG='N')

    THEN
      UPDATE pcp_activity
         SET usable_yn = 'N', -- mark activity unusable for PCP calc
             not_usable_reason = 'spec'
       WHERE participant_provider_id = TARGET_PROVIDER_ID
         AND participant_global_id IS NULL;

    ELSE
      UPDATE pcp_activity -- save possible PCP's global id 
         SET participant_global_id = GLOB_ID 
       WHERE participant_provider_id = TARGET_PROVIDER_ID
         AND participant_global_id IS NULL;
 
      INSERT INTO pcp_patient (
           evaluation_year, 
           evaluation_month,
           institution_id, 
           patient_id, 
           global_person_id,
           origin_code
        ) VALUES (
           report_year, 
           report_month,
           INST_ID,
           TARGET_PROVIDER_ID, 
           GLOB_ID,
           'LOCAL_PROVIDER'
        );
    END IF;
  END IF;

  UPD_CNT := UPD_CNT + 1;
END LOOP; -- the for-each PROV_ID loop
END;

So, here it ends. Hmm, what did all of this actually accomplish?

  • Interaction usable_yn was set to N if there is no provider record knows the provider as a primary care provider
  • Interaction table gets a provider global person id assigned
  • a row is inserted into a table called PCP_PATIENT

it strikes me that the first two points could have easily been done with a join when the Interaction tally is first built. The last bullet is some global side-effect which I have no idea yet how to think of. Whatever PCP_PATIENT is used later to be joined to the final query for building the results relations. But why in the world would we insert a PCP_PATIENT row if we are looking at PROVIDER_id? This totally makes no sense, unless PCP_PATIENT table somehow morphed in purpose. I have tried to confirm that, but I can't. PCP_PATIENT seems to be joined only with patient_id = some patient_id, not with a provider_id. So, sticking a provider_id into patient_id seems dead useless code.

How would I do this work here? I would never want to write code like this, I get high blood pressure and anxiety when I have to touch such code.

The input is PCP_ACTIVITY, the output is PCP_ACTIVITY with modifications. The main decision point is to look for provider records with same global person id and only allow for providers that have at least one PCP_YN set to 'Y' (this is nowhere stated, but all that other indirect stuff makes me conclude that I'm right.) This seems simple enough:

CREATE TABLE PCP_ACTIVITY_2 AS
SELECT act.institution_id, 
       act.patient_id,
       act.participant_provider_id,
       act.participant_number,
       act.participant_number_sys_id,
       act.participation_time,
       MAX(usable_yn) AS usable_yn, -- this is a good logical thing to do and works if N and U < Y
       act.act_type_code
  FROM PCP_ACTIVITY act
  INNER JOIN Person thisPerson
    ON(    thisPerson.person_id = act.participant_provider_id
       AND thisPerson.institution_id = act.institution_id)
  INNER JOIN Person otherPerson
    ON(otherPerson.global_person_id = thisPerson.global_person_id)
  INNER JOIN Local_Provider otherProvider
    ON(    otherProvider.local_provider_id = otherPerson.person_id
       AND otherProvider.institution_id = otherPerson.institution_id)
  WHERE usable_yn <> 'N'
  GROUP BY 
       act.institution_id, 
       act.patient_id,
       act.participant_provider_id,
       act.participant_number,
       act.participant_number_sys_id,
       act.participation_time,
       act.act_type_code

but then also, what's the point of carrying along Interaction records that are marked as "not to be used"? Why don't we just throw them away? In that case it gets really much easier:

CREATE TABLE PCP_ACTIVITY_2 AS
SELECT act.institution_id, 
       act.patient_id,
       act.participant_provider_id,
       act.participant_number,
       act.participant_number_sys_id,
       act.participation_time,
       act.act_type_code
  FROM PCP_ACTIVITY act
  WHERE usable_yn <> 'N'
    AND EXISTS (
      SELECT 1 
        FROM Person thisPerson
        INNER JOIN Person otherPerson
          ON(otherPerson.global_person_id = thisPerson.global_person_id)
        INNER JOIN Local_Provider otherProvider
          ON(    otherProvider.local_provider_id = otherPerson.person_id
             AND otherProvider.institution_id = otherPerson.institution_id)
        WHERE thisPerson.person_id = act.participant_provider_id
          AND thisPerson.institution_id = act.institution_id
          AND otherProvider.pcp_yn = 'Y'
      )

See, now we don't even need to do a GROUP by. This is also easy enough of a constraint to simply add to all the other filtering of the Interaction table.

So, then again I wonder: how do we know that the complex procedure with the many updates is better than a single query that does this? Of course I would not run this query like that, I would integrate it with all the other filters over the Interaction table.

For today I conclude with two questions:

  • please remove the CURSOR definition in the beginning of this procedure, this seems to have no use and might still cause useless database action.
  • please remove the INSERT into PCP_PATIENT piece at the end, it seems useless work.
    • also remove everything dealing with INST_ID, it too seems useless code
    • look at the variable declarations marked -- DEAD above and delete them

None of these two steps will likely make this all blazingly fast, but there is no point of having dead code around when trying to get something to work.

Now that I have rewritten the whole thing into a straight filter query without even GROUP BY left, I know that my approach is better than all that UPDATE. (Recall also that John Hook is proposing similar things: avoid all that UPDATE stuff)

Procedure: INPATIENT_LOOKUP

This puzzle has possessed me now. I can't stop. Moving on with this INPATIENT_LOOKUP procedure.

PROCEDURE INPATIENT_LOOKUP(
      report_year    IN INT,
      report_month    IN INT
    ) IS

  TYPE time_list IS 
     TABLE OF encounter_account.admit_time%TYPE
     INDEX BY BINARY_INTEGER;

  ADMIT_TIMES        time_list;
  DISCHARGE_TIMES    time_list;
  TIME_LIST_CNT      NUMBER := 0;
  UPD_CNT            NUMBER := 0;
  I                  NUMBER := 0;
  CNT                NUMBER := 0;
  LKP_CNT            NUMBER :=    0;
  GLOBAL_ID          PERSON.global_person_id%TYPE := 0;
  PREV_GLOBAL_ID     PERSON.global_person_id%TYPE := 0;

  ACTIVITY_TIME     ENCOUNTER_ACCOUNT.admit_time%TYPE := NULL;
  PA_ROWID          UROWID;

and here again a cursor with the first SQL:

  CURSOR pcp_act_read is
    SELECT pp.global_person_id,
           pa.participation_time,
           pa.rowid
      FROM pcp_activity pa
        INNER JOIN pcp_patient pp
          ON(    pp.evaluation_year = report_year 
             AND pp.evaluation_month = report_month
             AND pp.institution_id = pa.institution_id
             AND pp.patient_id = pa.patient_id)
      WHERE pa.participant_global_id IS NOT NULL 
        AND pa.usable_yn = 'Y'
      ORDER BY pp.global_person_id, pa.participation_time;

BEGIN
  OPEN    pcp_act_read;
  LOOP
    FETCH pcp_act_read 
     INTO GLOBAL_ID, ACTIVITY_TIME, PA_ROWID;
     EXIT WHEN pcp_act_read%NOTFOUND;
     CNT := CNT + 1;

so that's just another way of doing the FOR loop from the provider thing above. Our input data is again all Interaction records. Joining with special Patient records. So not only do we copy all Acts, we also have copied all Patients before we start all that expensive work.

Now I don't promote more hacks upon more hacks, but if we need a short term fix, here is what we can do:

  • write all that PCP_ACTIVITY and PCP_PATIENT thing as well as the LOCAL_PROVIDERS into an EXTERNAL datafile on the SAN.
  • use a 2nd Oracle instance on a separate machine to mount these datafiles and do all expensive work thus freeing up Condor for actual use.

I'll not spend further time on that, because I'm more intesteded in proving that straight-forward SQL will be faster than all that.

So, what is the work here?

    IF GLOBAL_ID != PREV_GLOBAL_ID
    THEN 
      IF TIME_LIST_CNT > 0
      THEN TIME_LIST_CNT := 0;
        ADMIT_TIMES.DELETE;
        DISCHARGE_TIMES.DELETE;
      END IF;

so there is something about going down an ordered list emulating some sort of group by global id operation. This here is what we want to do once per group:

      SELECT trunc(e.admit_time),
             DECODE(e.discharge_time,
                    NULL,
                    trunc(e.admit_time+1),
                    trunc(e.discharge_time+1))
        BULK COLLECT
        INTO ADMIT_TIMES, DISCHARGE_TIMES
        FROM encounter_account e
          INNER JOIN PCP_PATIENT a
            ON(    a.evaluation_year = report_year
               AND a.evaluation_month = report_month
               AND a.global_person_id = GLOBAL_ID
               AND a.institution_id = e.institution_id
               AND a.patient_id = e.patient_id
        WHERE (e.care_setting_sys_id, e.care_setting_code) IN (
                  SELECT target_sys_id,target_code
                    FROM concept_link
                   WHERE source_sys_id = 334 
                     AND source_code = 'INPATIENT_CARE_SETTINGS'
                     AND type_code = 'EL' 
                     AND type_sys_id = 4
                   GROUP BY target_sys_id, target_code
                 )
          AND e.admit_time IS NOT NULL;

So we're looking for encounters. But whose encounters? The one of any patient with the current GLOBAL_ID. What kind of encounter? One whose "care_setting" concept is an element in a list (EL) named INPATIENT_CARE_SETTINGS. But I have no idea why we are doing this with an IN subselect and a nested group by. Why not just make a straight forward INNER JOIN:

        INNER JOIN concept_link
          ON(    target_sys_id = e.care_setting_sys_id
             AND target_code = e.care_setting_code
             AND source_sys_id = 334 
             AND source_code = 'INPATIENT_CARE_SETTINGS'
             AND type_code = 'EL' 
             AND type_sys_id = 4)

or just to be sure we're not duplicating rows, it should be an EXISTS subselect:

        WHERE EXISTS (SELECT 1
            FROM concept_link
            WHERE target_sys_id = e.care_setting_sys_id
              AND target_code = e.care_setting_code
              AND source_sys_id = 334 
              AND source_code = 'INPATIENT_CARE_SETTINGS'
              AND type_code = 'EL' 
              AND type_sys_id = 4
          )

that's much clearer.

So what we have now is two parallel arrays of inpatient encounters for each of the patients. And because we're looping over a list of Interactions ordered by global patient id, we naturally want to do this searching for inpatient encounters only once.

Now comes the conclusion of the work that we wanted to do once per group.

      TIME_LIST_CNT := ADMIT_TIMES.COUNT;
      PREV_GLOBAL_ID := GLOBAL_ID;
      LKP_CNT := LKP_CNT + 1;
    END IF;

After this END IF, supposedly we are doing things for every record.

   IF ACTIVITY_TIME IS NULL
   THEN 
     UPDATE pcp_activity
        SET usable_yn = 'N',
            not_usable_reason = 'time'
        WHERE rowid = PA_ROWID;
     UPD_CNT := UPD_CNT + 1;

wait a minute? If there is no ACTIVITY_TIME, say because there was no Encounter for this patient, then we say that the Interaction is not usable? That's not fair. It should just leave things alone. Either this should never happen, or it should just not cause any such UPDATE.

Now here is what happens most of the time I suppose:

    ELSE 
      FOR I IN 1..TIME_LIST_CNT
      LOOP
        IF     ACTIVITY_TIME >= ADMIT_TIMES(I) 
           AND ACTIVITY_TIME <= DISCHARGE_TIMES(I)
        THEN 
          UPDATE pcp_activity
             SET usable_yn='N', 
                 not_usable_reason='inp'
             WHERE rowid = PA_ROWID;

          UPD_CNT := UPD_CNT + 1;
          EXIT;
        END IF;
      END LOOP;
    END IF;
  END LOOP;

Oh, that was it? So, this appears to be this lateral inhibition situation. That is, if there is an inpatient encounter contemporaneously with another Interaction, then we will disregard that other Interaction.

Two questions for me:

  1. how would I simplify this work here,
  2. how would I implement this lateral inhibition instead?

To the first question: I would consider this again an input/output transformation:

CREATE TABLE PCP_ACTIVITY_2 AS
SELECT act.institution_id, 
       act.patient_id,
       act.participant_provider_id,
       act.participant_number,
       act.participant_number_sys_id,
       act.participation_time,
       act.act_type_code
  FROM PCP_ACTIVITY act
  WHERE usable_yn <> 'N'
    AND NOT EXISTS ( -- any inpatient encounter for this patient at the same time
      SELECT 1 
        FROM Person thisPerson -- WHERE thisPerson = act.(patient_id, institution_id)
        -- standard 3-hop through global person id to other patient records
        INNER JOIN Person otherPerson
          ON(otherPerson.global_person_id = thisPerson.global_person_id)
        INNER JOIN Patient otherPatient
          ON(    otherPatient.patient_id = otherPerson.person_id
             AND otherPatient.institution_id = otherPerson.institution_id)
        -- pick up any encounter
        INNER JOIN Encounter_account encounter
          ON(    encounter.patient_id = otherPatient.patient_id
             AND encounter.institution_id = otherPatient.institution_id)
        -- pick up the inpatient encounters, notice, we don't care about dupes here, it's NOT EXIST
        INNER JOIN concept_link
          ON(    target_sys_id = e.care_setting_sys_id
             AND target_code = e.care_setting_code
             AND source_sys_id = 334 
             AND source_code = 'INPATIENT_CARE_SETTINGS'
             AND type_code = 'EL' 
             AND type_sys_id = 4)        
        -- arguments last:
        WHERE thisPerson.person_id = act.patient_id
          AND thisPerson.institution_id = act.institution_id
          AND act.time BETWEEN encounter.admission_time AND encounter.discharge_time
      )

and that's it. So, while I am talking about "lateral inhibition" here we didn't really care about that. What we did here was just go out to more data and see if something is there that we don't like. So, that's really all it is. Of course these Inpatient encounters are also Acts. I don't understand why these would completely block any other Interactions. Can't I have a visit from my PCP when I am in the hospital? I don't want to spend time on the lateral inhibition if really it may not be necessary.

Anyway, the query above would do it.

Procedure: RUN_SELECT_PATIENTS

So now we just one procedure left, and that's of interest because it defines what all goes into the Interaction table.

PROCEDURE RUN_SELECT_PATIENTS(
    activity_years IN INT,
    report_year    IN INT,
    report_month    IN INT
  ) IS

  RPT_DATE DATE := add_months(to_date(concat(concat(concat(to_char(report_year), '/'),
                   to_char(report_month)), '/01'), 'yyyy/mm/dd'), 1);
               
BEGIN
  DELETE FROM pcp_patient 
   WHERE evaluation_year = report_year
     AND evaluation_month = report_month;

why not just drop a table? This DELETE stuff is so expensive!

  INSERT INTO PCP_PATIENT (
      evaluation_year, 
      evaluation_month,
      institution_id,
      patient_id, 
      global_person_id,
      origin_code)
  WITH cohort_list AS (    
    SELECT pat.institution_id,
           pat.patient_id,
           p.global_person_id,
           pat.registration_date,
           pat.registration_end_date,
           enc.max_admit,
           pstat.last_activity_date,
           p.origin_code
      FROM PERSON p
        INNER JOIN PATIENT pat
          ON(    pat.institution_id=p.institution_id 
             AND pat.patient_id=p.person_id)
        LEFT OUTER JOIN ENCOUNTER_STATISTICS enc
          ON(    enc.institution_id=pat.institution_id
             AND enc.patient_id=pat.patient_id)
        LEFT OUTER JOIN PATIENT_STATISTICS pstat
          ON(    pstat.institution_id=pat.institution_id
             AND pstat.patient_id=pat.patient_id)
        INNER JOIN (
            SELECT p1.global_person_id
              FROM PERSON p1
             WHERE p1.date_of_birth < (RPT_DATE-180) 
               AND p1.date_of_birth > (RPT_DATE-(365*100))
               AND p1.human_yn='Y'
               AND p1.origin_code='PATIENT'
               AND p1.institution_id IN (1,2,3,7,9,11,14,102,103,108,109,110)
               AND p1.name_last IS NOT NULL
               AND p1.name_first IS NOT NULL
               AND p1.home_address_zip_code_code IN (
                   SELECT z.zip_code 
                     FROM zip_code z 
                     WHERE z.zip_code LIKE '46%' 
                       AND z.county_name IN ('BOONE','HAMILTON','HANCOCK', 'HENDRICKS','JOHNSON', 'MADISON','MARION','MORGAN','PUTNAM')
                 )
            MINUS
            SELECT p2.global_person_id
              FROM PERSON p2
              WHERE p2.human_yn='N' 
                 OR p2.date_of_death < (RPT_DATE-180)
            ) persons 
          ON(persons.global_person_id = p.global_person_id)
          WHERE p.institution_id IN (1,2,3,7,9,11,14,102,103,108,109,110)
   )

   SELECT report_year, 
          report_month,
          co.institution_id,
          co.patient_id,
          co.global_person_id,
          co.origin_code
     FROM cohort_list co
     WHERE co.global_person_id IN (
       SELECT co1.global_person_id
         FROM cohort_list co1
         WHERE co1.registration_end_date >= (RPT_DATE-(365*activity_years))
            OR co1.last_activity_date    >= (RPT_DATE-(365*activity_years))
            OR co1.max_admit             >= (RPT_DATE-(365*activity_years))
            OR co1.registration_date     >= (RPT_DATE-(365*activity_years))
  );

So that's all. If I ruled I would take the form col1 IN (SELECT col2 FROM ...) away from SQL. I have never seen it do any good. Really these here are just a bunch of suppression rules to get rid of patients that are dead or don't live here. Nothing special.

Conclusion

So, I have gone through all the 4 main procedures. I believe that all cases of UPDATE and DELETE should be revised and replaced with simple filter queries. I think the PCP provider and the INPATIENT filter can both be executed in the same big query that filters the Interactions. But I too am wounded from my endless fights against the Oracle query optimizer. Still, DELETEs and UPDATES can never be better.

For a quick fix, because we're already writing the Interaction table and a special patient table, we could write those to a temporary transportable tablespace and then ship it off to a second Oracle instance to do the main crunching. That would be faster in any case. However, it would also be a bit more complex to manage. But not worse than what I'm hearing happens now.

I need to come back to the main query, to rewrite it to use my 4-part query with

  1. Interaction,
  2. InteractionTally?,
  3. InteractionMax? and
  4. InteractionWinner?.

From now on it becomes really tedious work because of all that special stuff in the RMRS data model. I am pleased that my Participation design appears to be well accepted these days. I wish the Acts were also here.

Nevertheless, with a view that exposes all these interactions as acts, we could make a UNION of all of them. It's not impossible.

Once we have the Act table, I would join onto it the global person ids of all participants right away. It seems to me those are needed a lot.

Determining the PCP status of providers ought to be a good thing to do separately of this, because there aren't so many providers that this can't be taking longer than an hour. With a simple provider PCP status table, the filtering of Interactions with non-PCPs is no big deal.

Doing the PCP_PATIENT up front seemed to be a good idea. Just that I would prefer to have replicate entries in there by local institution id. Because the heavy work is to find all those Interactions, so better have the join from Person to Patient pre-computed in the PCP_PATIENT table.

I would do no UPDATES at all ever. And no DELETE ever. Instead CREATE TABLE and DROP.


Original Unchanged Code

create or replace
PACKAGE BODY                          PCP AS

    PROCEDURE RUN_SELECT_PATIENTS(
                activity_years    IN INT,
                report_year    IN INT,
                report_month    IN INT
            ) IS
    /**    Patient search function to perform cohort selection before
RUN_PCP_SEARCH (previously part of RUN_PCP_SEARCH)
        Parameters:
            activity_years        - numb of years old activity can be
for qualifying patients in cohort
                          For example, "5" would indicate that anyone
with activity in the last 5 years
                          would be included in the patient cohort.
            report_year,
            report_month        - target month for reporting pcp activity
                          For example, year should be in format like "2007"
                          while month is number between 1 and 12 (Jan
and Dec)
    **/
                RPT_DATE DATE :=
add_months(to_date(concat(concat(concat(to_char(report_year), '/'),
to_char(report_month)), '/01'), 'yyyy/mm/dd'), 1);
               
        BEGIN

            /**    Populate the table with PATIENTs for which PCP should
be gathered
            **/
            DELETE FROM pcp_patient WHERE evaluation_year = report_year

                AND evaluation_month = report_month;

            INSERT INTO
PCP_PATIENT(evaluation_year,evaluation_month,institution_id,patient_id,global_person_id,origin_code)
            WITH cohort_list AS
            (    SELECT    pat.institution_id
                ,pat.patient_id
                ,p.global_person_id
                ,pat.registration_date
                ,pat.registration_end_date
                ,enc.max_admit
                ,pstat.last_activity_date
                        ,p.origin_code
                FROM PERSON p
                INNER JOIN PATIENT pat on
pat.institution_id=p.institution_id AND pat.patient_id=p.person_id
                LEFT OUTER JOIN rmrs.ENCOUNTER_STATISTICS enc
                    ON enc.institution_id=pat.institution_id AND
enc.patient_id=pat.patient_id
                LEFT OUTER JOIN rmrs.PATIENT_STATISTICS pstat
                    ON pstat.institution_id=pat.institution_id AND
pstat.patient_id=pat.patient_id
                INNER JOIN
                    (SELECT p1.global_person_id
                     FROM PERSON p1
                     WHERE p1.date_of_birth < (RPT_DATE-180) and
p1.date_of_birth > (RPT_DATE-(365*100))
                     AND p1.human_yn='Y'
                     AND p1.origin_code='PATIENT'
                     AND p1.institution_id IN
(1,2,3,7,9,11,14,102,103,108,109,110)
                     AND p1.name_last IS NOT NULL AND p1.name_first IS
NOT NULL
                     AND p1.home_address_zip_code_code IN
                    ( SELECT z.zip_code FROM zip_code z WHERE z.zip_code
LIKE '46%' AND z.county_name IN
                      ('BOONE','HAMILTON','HANCOCK','HENDRICKS','JOHNSON',
                       'MADISON','MARION','MORGAN','PUTNAM'
                      )
                    )
                 MINUS
                 SELECT p2.global_person_id
                 FROM PERSON p2
                 WHERE p2.human_yn='N' OR p2.date_of_death < (RPT_DATE-180)
                ) persons ON persons.global_person_id=p.global_person_id
                WHERE p.institution_id IN
(1,2,3,7,9,11,14,102,103,108,109,110)
            )
            SELECT report_year, report_month,
co.institution_id,co.patient_id,co.global_person_id,co.origin_code
            FROM cohort_list co
            WHERE co.global_person_id IN
            (    SELECT            co1.global_person_id
                FROM cohort_list    co1
                WHERE       co1.registration_end_date >=
(RPT_DATE-(365*activity_years))
                OR co1.last_activity_date    >=
(RPT_DATE-(365*activity_years))
                OR co1.max_admit             >=
(RPT_DATE-(365*activity_years))
                OR co1.registration_date     >=
(RPT_DATE-(365*activity_years))
            );
            COMMIT;

            DBMS_OUTPUT.PUT_LINE( to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- END' );
        END;


    PROCEDURE RUN_PCP_SEARCH(
                activity_years    IN INT,
                report_year    IN INT,
                report_month    IN INT
            ) IS
    /**    Main search function to perform entire PCP data gathering and
calculations (run RUN_SELECT_PATIENTS first)
        Parameters:
            activity_years        - numb of years old activity can be
for qualifying patients in cohort
                          For example, "5" would indicate that anyone
with activity in the last 5 years
                          would be included in the patient cohort.
            report_year,
            report_month        - target month for reporting pcp activity
                          For example, year should be in format like "2007"
                          while month is number between 1 and 12 (Jan
and Dec)
    **/
            MSG_CNT            NUMBER                    :=    0;
                        RPT_DATE DATE :=
add_months(to_date(concat(concat(concat(to_char(report_year), '/'),
to_char(report_month)), '/01'), 'yyyy/mm/dd'), 1);

        BEGIN
               
            DBMS_OUTPUT.PUT_LINE( 'Begin PARTICIPATION search at ' ||
to_char(sysdate,'YYYY-MON-DD HH24:MM') );
            insert into preavis.out(msg, insert_time) values ( 'Begin
PARTICIPATION search at ' || to_char(sysdate,'YYYY-MON-DD HH24:MM'),
sysDate );
            COMMIT;

            /**    Populate intial physician activity log for
patient-physician associations
            **/
                       
                        DELETE FROM PCP_ACTIVITY;
                        COMMIT;

            /** This institution list should be based on an
INSTITUTION_TREE.
                We don't want it to be hard-coded, but we also can't use
everything in INSTITUTION.
                We don't have permission to use everything.
            **/
            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 1 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 2 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 3 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 7 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 9 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 11 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 14 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 102 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 103 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 108 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 109 );

            PCP.RUN_ACTIVITY_LOAD( activity_years, report_year,
report_month, 110 );
                       
                            select count(*) into MSG_CNT from PCP_ACTIVITY;
            insert into preavis.out(msg, insert_time) values (
'a:'||MSG_CNT, sysDate );
            COMMIT;

            DBMS_OUTPUT.PUT_LINE( to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- Find DEF references for activity physicians' );
            insert into preavis.out(msg, insert_time) values (
to_char(sysdate,'dd-mon-yyyy HH24:MI:SS')||' -- Find DEF references for
activity physicians', sysDate );
            COMMIT;

            /** Look for a match to a DEFining identifier for a
LOCAL_PROVIDER
            **/
            UPDATE PCP_ACTIVITY pal
            SET pal.participant_provider_id=

            (SELECT pi.person_id
             FROM person_identifier pi
             WHERE pal.participant_number=pi.identifier AND
pal.participant_number_sys_id=pi.identifier_sys_id
             AND pi.identifier_assignment_code='DEF'
            ) WHERE pal.participant_provider_id IS NULL
            AND pal.participant_global_id IS NULL
            AND pal.usable_yn='Y'
            AND pal.participant_number IS NOT NULL
            AND pal.participant_number_sys_id IS NOT NULL
            ;
            COMMIT;

            DBMS_OUTPUT.PUT_LINE(  to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- Find REF physician references from PCP_ACTITIVTY' );
            insert into preavis.out(msg, insert_time) values ( 
to_char(sysdate,'dd-mon-yyyy HH24:MI:SS')||' -- Find REF physician
references from PCP_ACTITIVTY', sysDate );
            COMMIT;

        /** Now we look for a match to a REFining identifier for a
LOCAL_PROVIDER.
         ** Note that we match to an arbitrary MIN(local_provider_id) in
this case.
        **/
        UPDATE    PCP_ACTIVITY pal
        SET    pal.participant_provider_id=
        (SELECT MIN(pi.person_id)
         FROM person_identifier pi
         WHERE pal.participant_number=pi.identifier AND
pal.participant_number_sys_id=pi.identifier_sys_id
         AND pi.identifier_assignment_code='REF'
        ) WHERE pal.participant_provider_id IS NULL
        AND   pal.participant_global_id IS NULL
        AND   pal.usable_yn='Y'
        AND   pal.participant_number IS NOT NULL
        AND   pal.participant_number_sys_id IS NOT NULL
        ;
        COMMIT;

        /**    Search for global_person_id for the physicians in the
activity log
        **/
        DBMS_OUTPUT.PUT_LINE( to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- Find GLOBAL ID for activity physicians' );
        insert into preavis.out(msg, insert_time) values (
to_char(sysdate,'dd-mon-yyyy HH24:MI:SS')||' -- Find GLOBAL ID for
activity physicians', sysDate );
        COMMIT;

        PCP.PHYSICIAN_LOOKUP( report_year, report_month );
        COMMIT;

        /**    Mark the PARTICIPATIONS that occured during an INPATIENT
visit.
        **/
        DBMS_OUTPUT.PUT_LINE( to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- Find PCP_ACTIVITY during INPATIENT admissions' );
        insert into preavis.out(msg, insert_time) values (
to_char(sysdate,'dd-mon-yyyy HH24:MI:SS')||' -- Find PCP_ACTIVITY during
INPATIENT admissions', sysDate );
        COMMIT;

        PCP.INPATIENT_LOOKUP( report_year, report_month );
        COMMIT;

        /**    Mark the PARTICIPATIONS between excluded doctor/patient
combinations.
        **/
        DBMS_OUTPUT.PUT_LINE( to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- Find PCP_ACTIVITY between PCP_EXCLUSION
doctor/patient combinations' );
        insert into preavis.out(msg, insert_time) values (
to_char(sysdate,'dd-mon-yyyy HH24:MI:SS')||' -- Find PCP_ACTIVITY
between PCP_EXCLUSION doctor/patient combinations', sysDate );
        COMMIT;

        update pcp_activity a
        set usable_yn='N',not_usable_reason='pcp_exclusion'
        where exists
        (
            select 1
            from pcp_exclusion e
            inner join person p
            on e.patient_global_id=p.global_person_id
            where a.participation_time <= e.effective_date
            and a.participant_global_id=e.physician_global_id
            and a.institution_id=p.institution_id
            and a.patient_id=p.person_id
        )
        and usable_yn='Y';
        COMMIT;

        /**    Calculate PCP for each patient, based on activity.
            First, we calculate PCP based on just last 18 months of
activity.
            Next, we calculate PCP (for anyone not assinged a PCP in the
18 month pass)
            based on the entire 3 years worth of activity in PCP_ACTIVITY.
        **/
        DBMS_OUTPUT.PUT_LINE( to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- Populate PCP_PROVIDER_SUMMARY from PCP_ACTIVITY' );
        insert into preavis.out(msg, insert_time) values (
to_char(sysdate,'dd-mon-yyyy HH24:MI:SS')||' -- Populate
PCP_PROVIDER_SUMMARY from PCP_ACTIVITY', sysDate );
        COMMIT;

        INSERT INTO PCP_PROVIDER_SUMMARY
        (evaluation_year,evaluation_month
       
,global_person_id,participant_global_id,participation_count,participation_day_count
       
,participation_time_latest,participation_time_earliest,provider_freq_rank)
        SELECT    report_year, report_month
            ,ppl.global_person_id,pal.participant_global_id
            ,count(pal.participation_time) as participation_count
            ,count(distinct pal.participation_time) as
participation_day_count
            ,max(pal.participation_time) as participation_time_latest
            ,min(pal.participation_time) as participation_time_earliest
            ,RANK() OVER
            (PARTITION BY ppl.global_person_id
             ORDER BY    count(distinct pal.participation_time) DESC
                    ,max(participation_time) DESC
                    ,min(participation_time)
                    ,count(pal.participation_time) DESC
            ) as rank
        FROM PCP_ACTIVITY pal
        INNER JOIN pcp_patient ppl ON
ppl.institution_id=pal.institution_id AND ppl.patient_id=pal.patient_id
            AND ppl.evaluation_year=report_year AND
ppl.evaluation_month=report_month
        WHERE pal.participant_global_id IS NOT NULL AND usable_yn='Y'
                        AND pal.participation_time >=
(RPT_DATE-((365*activity_years)/2))
            AND ppl.origin_code='PATIENT'
        GROUP BY ppl.global_person_id,pal.participant_global_id
        ;
            COMMIT;

        INSERT INTO PCP_PROVIDER_SUMMARY
        (evaluation_year,evaluation_month
       
,global_person_id,participant_global_id,participation_count,participation_day_count
       

,participation_time_latest,participation_time_earliest,provider_freq_rank)
        SELECT    report_year, report_month
            ,ppl.global_person_id,pal.participant_global_id
            ,count(pal.participation_time) as participation_count
            ,count(distinct pal.participation_time) as
participation_day_count
            ,max(pal.participation_time) as participation_time_latest
            ,min(pal.participation_time) as participation_time_earliest
            ,RANK() OVER
                (PARTITION BY ppl.global_person_id
                 ORDER BY    count(distinct pal.participation_time) DESC
                        ,max(participation_time) DESC
                        ,min(participation_time)
                        ,count(pal.participation_time) DESC
            ) as rank
        FROM PCP_ACTIVITY pal
        INNER JOIN pcp_patient ppl ON
ppl.institution_id=pal.institution_id AND ppl.patient_id=pal.patient_id
            AND ppl.evaluation_year=report_year AND
ppl.evaluation_month=report_month
        WHERE    pal.participant_global_id IS NOT NULL
                        AND pal.usable_yn='Y'
            AND pal.participation_time >= (RPT_DATE-((365*activity_years)))
            AND (pal.participation_time >= (RPT_DATE-((30))) and
pal.participation_time < RPT_DATE)
            AND ppl.global_person_id NOT IN
                (SELECT distinct(pps.global_person_id)
                 FROM pcp_provider_summary pps
                 WHERE pps.evaluation_year=report_year AND
pps.evaluation_month=report_month
                )
            AND ppl.origin_code='PATIENT'
        GROUP BY ppl.global_person_id,pal.participant_global_id
        ;
            COMMIT;

                            select count(*) into MSG_CNT from
PCP_PROVIDER_SUMMARY;
            insert into preavis.out(msg, insert_time) values (
'e:'||MSG_CNT, sysDate );
            COMMIT;

            DBMS_OUTPUT.PUT_LINE( to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- Mark PCPs as global physicians with most activity per
patient' );
            insert into preavis.out(msg, insert_time) values (
to_char(sysdate,'dd-mon-yyyy HH24:MI:SS')||' -- Mark PCPs as global
physicians with most activity per patient', sysDate );
            COMMIT;

            /**    Find most commonly occuring physician for each patient
            **/
            UPDATE PCP_PROVIDER_SUMMARY pps
            SET pps.pcp_assigned_yn='Y'
            WHERE provider_freq_rank=1
            AND evaluation_year=report_year AND
evaluation_month=report_month;
            COMMIT;

            DBMS_OUTPUT.PUT_LINE( to_char(sysdate,'dd-mon-yyyy
HH24:MI:SS')||' -- END' );
            insert into preavis.out(msg, insert_time) values (
to_char(sysdate,'dd-mon-yyyy HH24:MI:SS')||' -- END', sysDate );
            COMMIT;
        END;


    PROCEDURE INPATIENT_LOOKUP(
                report_year    IN INT,
                report_month    IN INT
            ) IS

            TYPE time_list IS TABLE OF encounter_account.admit_time%TYPE
INDEX BY BINARY_INTEGER;

            ADMIT_TIMES        time_list;
            DISCHARGE_TIMES        time_list;
            TIME_LIST_CNT        NUMBER                    :=    0;

            UPD_CNT            NUMBER                    :=    0;
            I            NUMBER                    :=    0;
            CNT            NUMBER                    :=    0;
            LKP_CNT            NUMBER                    :=    0;

            GLOBAL_ID        PERSON.global_person_id%TYPE        :=    0;
            PREV_GLOBAL_ID        PERSON.global_person_id%TYPE       
:=    0;

            ACTIVITY_TIME        ENCOUNTER_ACCOUNT.admit_time%TYPE := NULL;

            PA_ROWID        UROWID;

            cursor pcp_act_read is
                SELECT       
pp.global_person_id,pa.participation_time,pa.rowid
                FROM        pcp_activity pa
                INNER JOIN    pcp_patient pp
                    ON pp.evaluation_year=report_year AND
pp.evaluation_month=report_month
                    AND pp.institution_id=pa.institution_id AND
pp.patient_id=pa.patient_id
                WHERE        pa.participant_global_id IS NOT NULL AND
pa.usable_yn='Y'
                                ORDER BY   
pp.global_person_id,pa.participation_time
                ;

        BEGIN

            dbms_output.put_line( '-- Starting PCP_INPATIENT procedure
'||to_char(sysdate,'YYYY-MON-DD HH24:MM') );
            insert into preavis.out(msg, insert_time) values ( '--
Starting PCP_INPATIENT procedure '||to_char(sysdate,'YYYY-MON-DD
HH24:MM'), sysDate );
            COMMIT;

            OPEN    pcp_act_read;

            LOOP
                FETCH pcp_act_read INTO GLOBAL_ID, ACTIVITY_TIME, PA_ROWID;

                EXIT WHEN pcp_act_read%NOTFOUND;

                CNT := CNT + 1;

                IF GLOBAL_ID != PREV_GLOBAL_ID
                   THEN IF TIME_LIST_CNT > 0
                       THEN TIME_LIST_CNT := 0;
                        ADMIT_TIMES.DELETE;
                        DISCHARGE_TIMES.DELETE;
                    END IF;
                    SELECT   
trunc(e.admit_time),DECODE(e.discharge_time,NULL
                        ,trunc(e.admit_time+1),trunc(e.discharge_time+1))
                    BULK COLLECT
                    INTO    ADMIT_TIMES, DISCHARGE_TIMES
                    FROM    encounter_account e
                        INNER JOIN PCP_PATIENT a
                        ON a.evaluation_year=report_year AND
a.evaluation_month=report_month
                        AND a.global_person_id = GLOBAL_ID
                        AND a.institution_id=e.institution_id AND
a.patient_id=e.patient_id
                    WHERE    (e.care_setting_sys_id,e.care_setting_code) IN
                        (SELECT target_sys_id,target_code FROM concept_link
                             WHERE source_sys_id=334 and
source_code='INPATIENT_CARE_SETTINGS'
                             AND type_code='EL' and type_sys_id=4
                             GROUP BY target_sys_id,target_code
                        )
                                        AND e.admit_time IS NOT NULL
                    ;
                    TIME_LIST_CNT := ADMIT_TIMES.COUNT;
                    PREV_GLOBAL_ID := GLOBAL_ID;
                    LKP_CNT := LKP_CNT + 1;
                END IF;

                IF ACTIVITY_TIME IS NULL
                   THEN UPDATE pcp_activity
                        SET    usable_yn='N',not_usable_reason='time'
                        WHERE    rowid = PA_ROWID;
                    UPD_CNT := UPD_CNT + 1;
                   ELSE FOR I IN 1..TIME_LIST_CNT
                    LOOP
                        IF ACTIVITY_TIME >= ADMIT_TIMES(I) AND
ACTIVITY_TIME <= DISCHARGE_TIMES(I)
                           THEN UPDATE pcp_activity
                                SET usable_yn='N', not_usable_reason='inp'
                                WHERE    rowid = PA_ROWID;
                            UPD_CNT := UPD_CNT + 1;
                            EXIT;
                        END IF;
                    END LOOP;
                END IF;

            END LOOP;

            COMMIT;

            dbms_output.put_line( '-- Ending PCP_INPATIENT procedure
'||to_char(sysdate,'YYYY-MON-DD HH24:MM') );
            dbms_output.put_line( '        Read CNT = '||CNT );
            dbms_output.put_line( '      Update CNT = '||UPD_CNT );
            dbms_output.put_line( 'Visit Lookup CNT = '||LKP_CNT );

            insert into preavis.out(msg, insert_time) values ( '--
Ending PCP_INPATIENT procedure '||to_char(sysdate,'YYYY-MON-DD
HH24:MM'), sysDate );
            insert into preavis.out(msg, insert_time) values ( '       
Read CNT = '||CNT, sysDate );
            insert into preavis.out(msg, insert_time) values ( '     
Update CNT = '||UPD_CNT, sysDate );
            insert into preavis.out(msg, insert_time) values ( 'Visit
Lookup CNT = '||LKP_CNT, sysDate );
            COMMIT;

            CLOSE pcp_act_read;
        END;


    PROCEDURE PHYSICIAN_LOOKUP(
                report_year    IN INT,
                report_month    IN INT
                ) IS

            UPD_CNT            NUMBER                    :=    0;
            I            NUMBER                    :=    0;
            LKP_CNT            NUMBER                    :=    0;

            TARGET_PROVIDER_ID   
LOCAL_PROVIDER.local_provider_id%TYPE    :=    0;
            PREV_PROVIDER_ID    LOCAL_PROVIDER.local_provider_id%TYPE   
:=    0;
            PCP_MIN_FLAG        LOCAL_PROVIDER.pcp_yn%TYPE;
            PCP_MAX_FLAG        LOCAL_PROVIDER.pcp_yn%TYPE;
            INST_ID            LOCAL_PROVIDER.institution_id%TYPE   
:=    0;

            PART_TIME        PCP_ACTIVITY.participation_time%TYPE;

            GLOB_ID            PERSON.global_person_id%TYPE        :=    0;
            PREV_GLOB_ID        PERSON.global_person_id%TYPE        :=    0;

            TYPE prov_id_list IS TABLE OF
PCP_ACTIVITY.participant_provider_id%TYPE INDEX BY BINARY_INTEGER;

            PROV_IDS        prov_id_list;
            PROV_IDS_CNT        NUMBER                    :=    0;

                        RPT_DATE DATE :=
add_months(to_date(concat(concat(concat(to_char(report_year), '/'),
to_char(report_month)), '/01'), 'yyyy/mm/dd'), 1);

            /**    This CURSOR is used to read the PCP_ACTIVITY rows
where we have identified a provider/doctor but have
                not yet connected it to the global physician occurences
or checked the validity of
                the doctor for being a PCP.
            **/
            CURSOR pcp_activity_read IS
                SELECT    participant_provider_id,participation_time
                FROM    pcp_activity
                WHERE    participant_global_id IS NULL
                    AND participant_provider_id IS NOT NULL
                    AND usable_yn='Y'
                ORDER BY participant_provider_id,participation_time
                FOR UPDATE
            ;

        BEGIN

            DBMS_OUTPUT.PUT_LINE( 'Beginning PCP_LOOKUP at ' ||
to_char(sysdate,'YYYY-MON-DD HH24:MM') );
            insert into preavis.out(msg, insert_time) values (
'Beginning PCP_LOOKUP at ' || to_char(sysdate,'YYYY-MON-DD HH24:MM'),
sysDate );
            COMMIT;

            SELECT    pa.participant_provider_id
            BULK COLLECT
            INTO    PROV_IDS
            FROM    pcp_activity pa
            WHERE    pa.participant_provider_id IS NOT NULL
                AND pa.usable_yn='Y'
            GROUP BY pa.participant_provider_id
            ;

            PROV_IDS_CNT := PROV_IDS.COUNT;

            LKP_CNT := 0;
            UPD_CNT := 0;

            FOR I IN 1..PROV_IDS_CNT
            LOOP
                TARGET_PROVIDER_ID := PROV_IDS(I) ;

                BEGIN
                    SELECT    institution_id
                    INTO    INST_ID
                    FROM    local_provider
                    WHERE    local_provider_id = TARGET_PROVIDER_ID
                    ;

                    SELECT    p1.global_person_ID, MIN(lp1.pcp_yn),
MAX(lp1.pcp_yn)
                    INTO    GLOB_ID, PCP_MIN_FLAG, PCP_MAX_FLAG
                    FROM    LOCAL_PROVIDER lp
                    INNER JOIN PERSON per on
lp.institution_id=per.institution_id
                                AND lp.local_provider_id=per.person_id
                    INNER JOIN PERSON p1 ON
p1.global_person_id=per.global_person_id
                    LEFT OUTER JOIN LOCAL_PROVIDER lp1    ON
lp1.institution_id=p1.institution_id
                                    AND lp1.local_provider_id=p1.person_id
                    WHERE    lp.local_provider_id = TARGET_PROVIDER_ID
                            AND    (lp1.retirement_time IS NULL OR
lp1.retirement_time >= (RPT_DATE-365))
                    GROUP BY p1.global_person_id
                    ;
                                /** ELSE try to read the global provider
id value    **/
                EXCEPTION
                    WHEN NO_DATA_FOUND THEN GLOB_ID := 0;
                END;

                LKP_CNT := LKP_CNT + 1;

                IF GLOB_ID = 0                /** IF no global id
(should not happen)            **/
                   THEN UPDATE    pcp_activity
                        SET    usable_yn='N', not_usable_reason='id'
                        WHERE    participant_provider_id =
TARGET_PROVIDER_ID
                            AND participant_global_id IS NULL;
                                    /** THEN mark activity as unusable
in PCP calc        **/
                   ELSE IF PCP_MAX_FLAG='N' OR (PCP_MAX_FLAG='U' AND
PCP_MIN_FLAG='N')
                                    /**      ELSE IF physician cannot be
a PCP        **/
                       THEN UPDATE pcp_activity
                            SET    usable_yn='N',not_usable_reason='spec'
                            WHERE    participant_provider_id =
TARGET_PROVIDER_ID
                            AND participant_global_id IS NULL;
                                    /**           THEN mark activity
unusable for PCP calc    **/
                       ELSE UPDATE    pcp_activity
                            SET    participant_global_id = GLOB_ID
                            WHERE    participant_provider_id =
TARGET_PROVIDER_ID
                                AND participant_global_id IS NULL;
                                /**           ELSE save possible PCP's
global id    **/
                        INSERT INTO pcp_patient
                       
(evaluation_year,evaluation_month,institution_id,patient_id,global_person_id,origin_code)
                        VALUES
                       
(report_year,report_month,INST_ID,TARGET_PROVIDER_ID,GLOB_ID,'LOCAL_PROVIDER')
                        ;
                    END IF;
                END IF;

                UPD_CNT := UPD_CNT + 1;

            END LOOP;


            DBMS_OUTPUT.PUT_LINE( '  Physician_Lookup completed at:
'||to_char(sysdate,'YYYY-MON-DD HH24:MM') );
            DBMS_OUTPUT.PUT_LINE( '  PCP Lookup Count: '||LKP_CNT );
            DBMS_OUTPUT.PUT_LINE( '  Update     Count: '||UPD_CNT );

            insert into preavis.out(msg, insert_time) values ( ' 
Physician_Lookup completed at: '||to_char(sysdate,'YYYY-MON-DD
HH24:MM'), sysDate );
            insert into preavis.out(msg, insert_time) values ( '  PCP
Lookup Count: '||LKP_CNT, sysDate );
            insert into preavis.out(msg, insert_time) values ( ' 
Update     Count: '||UPD_CNT, sysDate );
            COMMIT;

        END;

    PROCEDURE RUN_ACTIVITY_LOAD(
                activity_years IN INT,
                report_year IN INT,
                report_month IN INT,
                target_institution_id IN INT
            ) IS
    /**    Loads data from PARTICIPATION into PCP_ACTIVITY
        Parameters:
            activity_years        - numb of years old activity can be
for qualifying patients in cohort
                          For example, "5" would indicate that anyone
with activity in the last 5 years
                          would be included in the patient cohort.
            report_year,
            report_month        - target month for reporting pcp activity
                          For example, year should be in format like "2007"
                          while month is number between 1 and 12 (Jan
and Dec)
            target_institution_id    - institution to copy
    **/
                        RPT_DATE DATE :=
add_months(to_date(concat(concat(concat(to_char(report_year), '/'),
to_char(report_month)), '/01'), 'yyyy/mm/dd'), 1);

        BEGIN
               
            DBMS_OUTPUT.PUT_LINE( 'Begin ' || target_institution_id || '
PARTICIPATION search at ' || to_char(sysdate,'YYYY-MON-DD HH24:MM') );
            insert into preavis.out(msg, insert_time) values ( 'Begin '
|| target_institution_id || ' PARTICIPATION search at ' ||
to_char(sysdate,'YYYY-MON-DD HH24:MM'), sysDate );
            COMMIT;

            /**    Populate intial physician activity log for
patient-physician associations
            **/
                       
            INSERT
            INTO PCP_ACTIVITY(institution_id,patient_id
               
,participant_provider_id,participant_number,participant_number_sys_id,participation_time
                ,usable_yn,act_type_code)
            SELECT    /*+ full(ppl) full(p) */
                ppl.institution_id,ppl.patient_id
               
,p.participant_provider_id,p.participant_number,p.participant_number_sys_id
                ,event_time,'Y','E'
            FROM PARTICIPATION p
            INNER JOIN PCP_PATIENT ppl
                ON ppl.institution_id=p.institution_id AND
ppl.patient_id=p.patient_id
                AND ppl.evaluation_year=report_year AND
ppl.evaluation_month=report_month
            WHERE (p.participation_type_sys_id,p.participation_type_code) IN
                (SELECT target_sys_id,target_code FROM concept_link
                 WHERE source_sys_id=334 and
source_code='PARTICIPATION_TYPES'
                 AND type_code='EL' and type_sys_id=4
                 GROUP BY target_sys_id,target_code)
            AND ( (p.participant_provider_id IS NOT NULL)
                OR
                  (p.participant_number IS NOT NULL AND
p.participant_number<>'99999' AND p.participant_number_sys_id IS NOT NULL)
                )
            AND (p.event_time >= (RPT_DATE-(365*activity_years)))
            AND p.institution_id=target_institution_id
            ;
            COMMIT;
        END;

END PCP;