Skip to content

Changes Import from Climsoft Functionalities#9070

Merged
N-thony merged 14 commits into
IDEMSInternational:masterfrom
Patowhiz:climsoft_updated_dialog
Aug 8, 2024
Merged

Changes Import from Climsoft Functionalities#9070
N-thony merged 14 commits into
IDEMSInternational:masterfrom
Patowhiz:climsoft_updated_dialog

Conversation

@Patowhiz
Copy link
Copy Markdown
Contributor

Fixes #7583
Replaces PR #8229

@rdstern @jkmusyoka this is ready for review.

Note that I have made significant changes to the dialog design. I'm happy to walk you through the dialog features if the design does not clearly reflect their intended use.

Additionally, some Climsoft users have requested a search functionality in the selector, stating that it can be difficult to scroll through and select the needed stations when there are many.

I propose we remove the "Import From Climsoft Wizard" menu as it is no longer needed.

@rdstern, I also suggest we remove the default date column, as it can be misleading when the data is not daily.

This PR was also meant to complement Climsoft desktop PR #733, which includes a dedicated R-Instat dialog for importing QC data into observation initial. Once this is merged, I can demonstrate to both the R-Instat and Climsoft teams how Climsoft desktop data can be QC'd in R-Instat and then sent back to the database.

Below is a screenshot of the new dialog.
1

@rdstern
Copy link
Copy Markdown
Collaborator

rdstern commented Jul 14, 2024

@Patowhiz I am testing using the Malawi database. It is so nice to have this. I am going to send some data (for Zambia) that could go into Initial.
a) The default on Select Data Table is now Initial. I suggest it shoulf be Final. (We accept that data can remain in Initial, but don't want to encourage it!
b) I started by importing (all) the metadata and it gave me an error message, see below. It all worked though and the 3 metadata data frames were imported.
image

c) I really like it that the metadata is so prominent now. @jkmusyoka we can be checking and also improving the metadata, and then writing the improved metadata back (as well, I hope)? At least the station metadata. So we should include the shape files for Malawi and Zambia (I use gadm - is there better) in the workshop. Then we can add the administrative districts, and also (maybe?) write them back to climsoft. We should check our export dialog - how well does it work for the metadata? We must do this well before the Zambia workshop. For Malawi we could also add a variable giving the type of station. There are 96 stations there, so I suggest a Variable called Rainfall for all rows. Then change it for those that are Main stations - or whatever they want to call them. We have a special right-click feature to edit a single value - which is fine for this task!

d) I now returned to the Observation data and there are no no stations, or elements?

image

I had to press Reset and then (of course) connect to the database again.

e) For information, if you go into the data selector and type kas you get kasungu straight away. Same for nkh.

f) Could you change Select Data Date to just Select Dates and also More options to Options

g) You have lost the most important table to import, namely the unstacked data. I am happy with all the options you have, but including the unstacked data should be the default. I think earlier, there was a checkbox for that. If so, then it should be checked by default.

h) In the elements (once I have selected the stations) they now all seem to be there. I thought earlier it just gave the elements that had data at (at least one of) the stations?

i) In the options there is a lot to untick, and I suggest it is rare that anyone will want all those. Also it isn't clear that there are other variables that are there anyway. Also, for daily data could the date-time variable be an option. It is confusing. I suggest all the columns to be exported into the long form (which we didn't use by the way!) should be present, and those that are always given are ticked and disabled - so you know what will be there. Then most of the others are unticked, by default.

j) Finally @Patowhiz you suggested, in the menu, that we delete the Wizard option. Here is the current menu:

image

If we delete that option, then I suggest we also delete the separator line above the Import from Climsoft. Then there is just one separator, between Import and Export.

I note also that Climsoft is currently the last of the importing and the first of the exporting options. Do we keep it like that, or move it to the top, so it is the first of both?

@Patowhiz
Copy link
Copy Markdown
Contributor Author

@rdstern in regards to:

Item (a), I set initial as the default because the idea was for to be consistent with Climsoft version 4 data flow. I envisaged Climsoft users to first visualize and QC what is in initial table before they move the data to final, from which they can use to produce products. Nevertheless, given your feedback and considering how initial will be irrelevant in the near future, I've set the default to final.

Item (b), the error message that you get when importing metadata is due the internal conversion of imported numeric columns to factor. In particular, when converting station_operational column to factor. @N-thony there is a serious bug when you convert a numeric column that has missing values only to factor. You can reproduce this by adding a new column with missing values only, then converting to numeric then to factor. Line 1501 of the data_object_R6.R file produces an empty integer that is added as a metadata attribute. I noticed there is a comment that the addition is a temporary fix to issue of add_columns not retaining attributes of new columns. I couldn't get the PR under which the temporary fix was added, so I'm not sure if putting a validation check to it would be regressing other areas. Could you check on how to fix this bug? @rdstern once the bug is fixed, it will also be solved here (note it's a general bug).

Item (d), could you retest it, I couldn't reproduce it on my machine.

Item (f), I was trying to distinguish the observation date from the entry date (which surprisingly doesn't exist in the 2 tables). I've now set it to "Select Dates" as suggested. I also preferred "More Options" over "Options" to indicate to users that selections done in the sub dialog are in addition to the main dialog selections. I've changed it to "Options" as suggested.

Item (g), I think I removed unstack data because of performance and UX reasons. It can take quite a long time for large data sets. I've now returned it. Do you still prefer this option to be checked by default? Note, when check records is clicked, the rows showed are of the stacked data.

Item(h), we both found that to be performance wise inefficient when we conducting training. Querying distinct elements on those 2 tables takes a lot of time and the UX becomes poor because users have to wait every time they reload the dialog or click on the elements receiver. That's why I decided to just displaying all the elements. Note also, there is no harm in selecting all the elements, the observation data returned will just be for the found elements. Do you still want me to revert to the feature of showing the elements found in the 2 tables only?

Item(i), agreed. Should we remove the conversion to date entirely, the make date dialog can be used to convert it to date? I think the ideal thing to do is to convert the datetime column into date time R equivalent, do we currently support date time R columns? I can include a checkbox that says convert date time to date, would you prefer that? I have temporarily removed this feature and I highly recommend we not add feature to not confuse users.

item(k) I suggest we remove the separator after this PR is merged, to avoid conflicts on frmMain.

I don't quite understand what you mean by "I note also that Climsoft is currently the last of the importing and the first of the exporting options. Do we keep it like that, or move it to the top, so it is the first of both?"

Last but not least, I'm hoping Climsoft users will get used to Selecting the dates(to filter data) and clicking on the Check Records button(intentionally added) first before clicking on Ok, failure to which they may have to learn how to kill R-Instat using the task manager.

@rdstern
Copy link
Copy Markdown
Collaborator

rdstern commented Jul 23, 2024

@Patowhiz many thanks for the updates and detailed reply. @jkmusyoka can we both be testing this version together?
@Patowhiz you have written separately that you have updated the data in the Malawi (online) climsoft database and also checked to Zambia one. Do either yet have any data in the Initial table to test that aspect? And I assume if we just enter a month or two of data, then that's where it will go - so we could do that? James, don't you have some data from the Zambia volunteer stations from Peter?

@Patowhiz you write: I don't quite understand what you mean by "I note also that Climsoft is currently the last of the importing and the first of the exporting options. Do we keep it like that, or move it to the top, so it is the first of both?"

Here is the current Climatic > File menu:

image

And you were proposing the wizard be deleted, so that will currently leave Climsoft Import a bit lonely, surrounded by 2 horizontal lines. I was assuming a bit more needs to be changed?

@Patowhiz
Copy link
Copy Markdown
Contributor Author

Patowhiz commented Jul 23, 2024

@rdstern

As I have mentioned above, to avoid frmMain conflicts, I'll remove the separators in the menu in another PR.

The Malawi and Zambia database don't have data in the initial. I didn't want to 'pollute' them with data that is not theirs. I'm happy to add if it is supplied. You can add it as well using the current Climsoft version 4 if you want.

If you want to play around with a database that we can always discard the changes, I recommend using mariadb_climsoft_db_v4 database. I have created it on the cloud version. You can use it to enter any data or metadata using current Climsoft version 4 release. That can include the imports that you have mentioned.

Like I have mentioned to you before, any data imported will go to initial and you can correct it from there as well.

Feel free to use other databases as well if you think that is more sensible. After which you can let me know when to 'reset' them.

@Patowhiz
Copy link
Copy Markdown
Contributor Author

@rdstern @jkmusyoka have you managed to review this PR.
@N-thony kindly review this PR if you have time.

Copy link
Copy Markdown
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Patowhiz I give my impressions, together with problems and suggestions:
a) I had previously used the Malawi cloud database. I found on connecting this time, that it had remembered that connection. Very nice. I only had to remember rthe password.
b) I started by importing the metadata. I still get this error message?

image

c) Then I returned to the other button and was surprised to see the Element numbers in the data selector. It had moved from the top to the bottom receiver. Could it remain at the top one, with the stations?

d) Thanks for adding the unstacked back in. I found the order of the elements in the unstacked data to be odd. It isn't a problem with just 3, but couldn't we have the same order as the IDs?

e) Then, in the stacked data it is nice that the order of the factor levels is (I think) the same as the ID order. But @jkmusyoka couldn't we take advantage of the IDs neing positive integers to import them as levels 2, 3 and 5? That would mean that whether we specify them as ID or name, we always import both?

image

f) The data come in as just a date_time variable. @Patowhiz we do have date-time variables in R-Instat, so please could it come - at least as a date-time (D.T) variable when imported?

image

g) In the above, I also added a date variable. Please could you add the date variable on import. When it is just daily data perhaps you also detect this and omit the date time, but I would also be happy if you always gave both? (Having both the date and the date-time will be fine for hourly or 10 minute data.)

h) The Station is a character variable. Please could it become a factor.

i) I then tried importing from Initial! I know there is nothing there for those station/elements, but it should tell me and not give an error, see below:

image

I have not checked other options yet, but this will do for now. This is going to be really nice! (Then to check the export too! I am so happy we are doing this now and not during the Zambia workshop!)

@Patowhiz Patowhiz mentioned this pull request Jul 30, 2024
@rdstern
Copy link
Copy Markdown
Collaborator

rdstern commented Aug 5, 2024

@Patowhiz is this now ready for testing again?

@Patowhiz
Copy link
Copy Markdown
Contributor Author

Patowhiz commented Aug 6, 2024

@rdstern @jkmusyoka This is now ready for review. I have made all the agreed changes.

When explaining to users, please emphasise the importance of using the date filter when importing data from multiple stations. Failing to do so might result in R-Instat hanging, requiring users to know how to terminate it.

@Patowhiz
Copy link
Copy Markdown
Contributor Author

Patowhiz commented Aug 6, 2024

@rdstern @N-thony

I have fixed the metadata import issue (as explained in this comment) by not using the data book R functions for converting to factor. I still recommend @N-thony to use my branch climsoft_update_dialog_metadata_bug to check how the error thrown in line 1501 of data_object_R6.R could be avoided.

This PR can now be merged regardless of whether issue #9086 is fixed or not.

Copy link
Copy Markdown
Collaborator

@rdstern rdstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Patowhiz this is looking very nice. @N-thony over to you to check and merge. I really hope this can be in the forthcoming version.

@Patowhiz later we need to change the Climatic > File menu. Also please could we discuss whether we could add some "inventory" options. For example I'd like perhaps to import some sunshine data. How can I check which stations have any sunshine records?

@Patowhiz
Copy link
Copy Markdown
Contributor Author

Patowhiz commented Aug 7, 2024

@rdstern

If you want to check if a station has sunshine records, just select the station and then select the sunshine element then click on check records button. The dialog will show you the number of records present. Easy pizzy.

@N-thony N-thony merged commit e72800b into IDEMSInternational:master Aug 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New feature requests on the climsoft import dialog

4 participants