diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/LICENSE b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..9f6ace032ef12834032016395deee26f252c3fa2 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/LICENSE @@ -0,0 +1,284 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. 
We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------ +Files: third_party/compute_library/... + +MIT License + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +------------------ +Files: ACKNOWLEDGEMENTS +LICENSE + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND + ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR + ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------ +Files: third_party/hexagon + +Copyright (c) 2016-2019, The Linux Foundation. All rights reserved. 
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted (subject to the limitations in the
+disclaimer below) provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright
+      notice, this list of conditions and the following disclaimer.
+
+    * Redistributions in binary form must reproduce the above
+      copyright notice, this list of conditions and the following
+      disclaimer in the documentation and/or other materials provided
+      with the distribution.
+
+    * Neither the name of The Linux Foundation nor the names of its
+      contributors may be used to endorse or promote products derived
+      from this software without specific prior written permission.
+
+NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE
+GRANTED BY THIS LICENSE. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT
+HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED
+WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR
+ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
+GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
+IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
+OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/README.md b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..bbf94c74a5e8dc3944e3803aa05d107606e6d3ac
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/README.md
@@ -0,0 +1,115 @@
+# Graph Convolutional Networks for Relational Graphs
+
+This repository contains the training (in TensorFlow 1.15) of a relational graph convolutional network on different computing architectures.
+
+## Background
+
+Keras-based implementation of Relational Graph Convolutional Networks for semi-supervised node classification on (directed) relational graphs.
+Our training code is an adaptation of the implementation of the [R-GCN paper](https://arxiv.org/abs/1703.06103) to support TensorFlow 1.15.
+
+The original implementation by the author is available at:
+ - Kipf, T. N. (2017). Github - tkipf/relational-gcn: Graph Convolutional Networks for relational graphs. https://github.com/tkipf/relational-gcn
+
+For reproduction of the *entity classification* results in Schlichtkrull et al. [Modeling Relational Data with Graph Convolutional Networks](https://arxiv.org/abs/1703.06103) (2017) [1], see the instructions below.
+
+The code for the *link prediction* task in [1] can be found in the following repository: https://github.com/MichSchli/RelationPrediction
+
+## Dependencies
+
+ * TensorFlow (1.15)
+ * numpy
+ * pandas
+ * scipy
+ * rdflib
+
+Further version dependencies are contained in `requirements.txt`.
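+The pinned versions can be installed with (assuming Python 3.7, as used in the commands below):
+
+```
+python3.7 -m pip install -r requirements.txt
+```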
+
+## Usage
+
+To replicate the experiments from Schlichtkrull et al. [1], first run (for AIFB on CPU):
+
+```
+python3.7 datasets/prepare_dataset.py -d aifb
+```
+or
+```
+bash scripts/datasets/prepare_aifb.sh
+```
+
+Afterwards, copy `aifb.pickle` into the `rgcn` directory so that your structure looks like this:
+
+```
+├── datasets
+│   ├── data
+│   │   ├── aifb
+│   │   │   ├── aifb_stripped.nt.gz
+│   │   │   ├── output
+│   │   │   │   ├── nodes.pkl
+│   │   │   │   ├── rel_dict.pkl
+│   │   │   │   ├── test_idx.npy
+│   │   │   │   ├── test_names.npy
+│   │   │   │   ├── train_idx.npy
+│   │   │   │   └── train_names.npy
+│   │   │   ├── README.txt
+│   │   │   └── strip_targets.py
+│   ├── aifb.pickle    <----------- copy ----------->
+│   ├── data_utils.py
+│   ├── __init__.py
+│   ├── prepare_dataset.py
+│   └── utils.py
+├── rgcn
+│   ├── aifb.pickle    <----------- copy ----------->
+│   ├── hyperparameters.py
+│   ├── __init__.py
+│   ├── layers
+│   │   ├── graph.py
+│   │   ├── __init__.py
+│   │   └── input.py
+│   ├── metrics.py
+│   ├── mutag.pickle
+│   ├── train.py
+│   └── utils.py
+```
+
+Then train the model with:
+
+```
+python3.7 rgcn/train.py -d aifb -de cpu --bases 0 --hidden 20 --l2norm 0.0 --testing
+```
+or
+```
+bash scripts/train_aifb_cpu.sh
+```
+
+For the MUTAG dataset, run:
+
+```
+python3.7 datasets/prepare_dataset.py -d mutag
+python3.7 rgcn/train.py -d mutag -de cpu --bases 30 --hidden 16 --l2norm 5e-4 --testing
+```
+
+For BGS, run:
+
+```
+python3.7 datasets/prepare_dataset.py -d bgs
+python3.7 rgcn/train.py -d bgs -de cpu --bases 40 --hidden 16 --l2norm 5e-4 --testing
+```
+
+For AM, run:
+
+```
+python3.7 datasets/prepare_dataset.py -d am
+python3.7 rgcn/train.py -d am -de cpu --bases 40 --hidden 10 --l2norm 5e-4 --testing
+```
+
+Note: Results depend on the random seed and will vary between re-runs.
+
+
+## References
+
+[1] M. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, M. Welling, [Modeling Relational Data with Graph Convolutional Networks](https://arxiv.org/abs/1703.06103), 2017
+
+
+## Performance results on CPU, GPU, and NPU
+
+These can be found in [documents](documents/main.md).
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/aifb/README.txt b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/aifb/README.txt
new file mode 100644
index 0000000000000000000000000000000000000000..b1247e43d3cda64dcba5fd64173871ed657368c5
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/aifb/README.txt
@@ -0,0 +1,33 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+1. Description: The AIFB dataset describes the AIFB research institute in terms of its staff, research groups, and publications. In (Bloehdorn et al. 2007) the dataset was first used to predict the affiliation (i.e., research group) for people in the dataset. The dataset contains 178 members of research groups; however, the smallest group contains only 4 people and is therefore removed from the dataset, leaving 4 classes. Moreover, we also remove the `employs` relation, which is the inverse of the `affiliation` relation, from the dataset.
+
+2. ML task: classification
+
+3. Number of instances: 176
+
+4. Original source: AIFB
+
+5. Linked to: AIFB
+
+6. Target variables
+   -"label_affiliation" (classification)
+
+
+7. Stratified data split (training/test):
+   -label_affiliation: TrainingSet.tsv (80%) and TestSet.tsv (20%)
+
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/aifb/strip_targets.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/aifb/strip_targets.py
new file mode 100644
index 0000000000000000000000000000000000000000..b61be47f7e2f19e6354661d3060e23e5658f87bc
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/aifb/strip_targets.py
@@ -0,0 +1,35 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+import rdflib as rdf
+import gzip
+
+g = rdf.Graph()
+
+g.parse('./aifb_fixed_complete.n3', format='n3')
+
+employs = rdf.term.URIRef("http://swrc.ontoware.org/ontology#employs")
+affiliation = rdf.term.URIRef("http://swrc.ontoware.org/ontology#affiliation")
+
+rels = set(g.predicates())
+
+g.remove((None, employs, None))
+g.remove((None, affiliation, None))
+
+with gzip.open('aifb_stripped.nt.gz', 'wb') as output:
+    g.serialize(output, format='nt')
+
+g.close()
\ No newline at end of file
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/am/README.txt b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/am/README.txt
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/am/strip_targets.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/am/strip_targets.py
new file mode 100644
index 0000000000000000000000000000000000000000..2d5129826eb36371106d216c686866de7542f93b
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/am/strip_targets.py
@@ -0,0 +1,34 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+import rdflib as rdf
+import gzip
+
+g = rdf.Graph()
+
+with gzip.open('am-combined.nt.gz', 'rb') as _input:
+    g.parse(_input, format='nt')
+
+rel = rdf.term.URIRef("http://purl.org/collections/nl/am/objectCategory")
+g.remove((None, rel, None))
+
+rel = rdf.term.URIRef("http://purl.org/collections/nl/am/material")
+g.remove((None, rel, None))
+
+with gzip.open('am_stripped.nt.gz', 'wb') as output:
+    g.serialize(output, format='nt')
+
+g.close()
\ No newline at end of file
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/bgs/README.txt b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/bgs/README.txt
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/bgs/strip_targets.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/bgs/strip_targets.py
new file mode 100644
index 0000000000000000000000000000000000000000..1258db4833f76c75cef482a36d36e9a36726f099
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/bgs/strip_targets.py
@@ -0,0 +1,35 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+import rdflib as rdf
+import gzip
+
+g = rdf.Graph()
+
+g.parse('./completeDataset.nt', format='nt')
+
+lith = rdf.term.URIRef("http://data.bgs.ac.uk/ref/Lexicon/hasLithogenesis")
+
+for s, p, o in g.triples((None, lith, None)):
+    print(s, p, o)
+
+
+g.remove((None, lith, None))
+
+with gzip.open('bgs_stripped.nt.gz', 'wb') as output:
+    g.serialize(output, format='nt')
+
+g.close()
\ No newline at end of file
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/mutag/README.txt b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/mutag/README.txt
new file mode 100644
index 0000000000000000000000000000000000000000..452fa84cdad6553798c1e1ef0cd1b1f5b9026ac4
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/mutag/README.txt
@@ -0,0 +1,33 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+1. Description: The MUTAG dataset is distributed as an example dataset for the DL-Learner toolkit (http://dl-learner.org). It contains information about complex molecules that are potentially carcinogenic, which is given by the `isMutagenic` property.
+
+2. ML task: classification
+
+3.
Number of instances: 340 + +4. Original source: MUTAG + +5. Linked to: MUTAG + +6. Target variables + -"label_mutagenic" (classification) + + +7. Stratified data split (training/test): + -label_mutagenic: TrainingSet.tsv (80%) and TestSet.tsv (20%) + diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/mutag/strip_targets.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/mutag/strip_targets.py new file mode 100644 index 0000000000000000000000000000000000000000..fe8915fa79dffa3db4bed3a129e346413ec49817 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data/mutag/strip_targets.py @@ -0,0 +1,31 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +import rdflib as rdf +import gzip + +g = rdf.Graph() + +g.parse('./carcinogenesis.owl', format='xml') + +is_mutagenic = rdf.term.URIRef("http://dl-learner.org/carcinogenesis#isMutagenic") + +g.remove((None, is_mutagenic, None)) + +with gzip.open('mutag_stripped.nt.gz', 'wb') as output: + g.serialize(output, format='nt') + +g.close() diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data_utils.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data_utils.py new file mode 100644 index 0000000000000000000000000000000000000000..0fabbfa3878b851b2e2f631fb4db184db5a5a10d --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/data_utils.py @@ -0,0 +1,391 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +""" Utils for dataset changes. 
""" + +from __future__ import print_function + +import os +import re +import sys +import gzip +import glob +import pickle as pkl +from collections import Counter + +import wget # pylint: disable=import-error +import rdflib as rdf # pylint: disable=import-error +import pandas as pd # pylint: disable=import-error +import numpy as np # pylint: disable=import-error +import scipy.sparse as sp # pylint: disable=import-error + +np.random.seed(42) + + +class RDFReader: + + """ RDFReader class for rdf files """ + + __graph = None + __freq = {} + + def __init__(self, file): + + self.__graph = rdf.Graph() + + if file.endswith('nt.gz'): + with gzip.open(file, 'rb') as f: # pylint: disable=invalid-name + self.__graph.parse(file=f, format='nt') + else: + self.__graph.parse(file, format=rdf.util.guess_format(file)) + + # See http://rdflib.readthedocs.io for the rdflib documentation + + self.__freq = Counter(self.__graph.predicates()) + + print('Graph loaded, frequencies counted.') + + + def triples(self, relation=None): + """ Yield triples of the relations """ + for s, p, o in self.__graph.triples((None, relation, None)): # pylint: disable=invalid-name + yield s, p, o + + + def __enter__(self): + return self + + + def __exit__(self, exc_type, exc_value, traceback): + self.__graph.destroy('store') + self.__graph.close(True) + + + def subject_set(self): + """ Get a set of subjects """ + return set(self.__graph.subjects()) + + + def object_set(self): + """ Get a set of objects """ + return set(self.__graph.objects()) + + + def relation_list(self): + """ + Returns a list of relations, ordered descending by frequenecy + :return: + """ + res = list(set(self.__graph.predicates())) + res.sort(key=lambda rel: - self.freq(rel)) + return res + + def __len__(self): + return len(self.__graph) + + def freq(self, relation): + """ + The frequency of this relation (how many distinct triples does it occur in?) + :param relation: + :return: + """ + if relation not in self.__freq: + return 0 + return self.__freq[relation] + + +def load_sparse_csr(filename): + """ Load a sparse matrix from file """ + loader = np.load(filename) + return sp.csr_matrix((loader['data'], loader['indices'], loader['indptr']), + shape=loader['shape'], dtype=np.float32) + + +def save_sparse_csr(filename, array): + """ Save a sparse matrix to file """ + np.savez(filename, data=array.data, indices=array.indices, + indptr=array.indptr, shape=array.shape) + + +def load_data(dataset_str='aifb', limit=-1): # pylint: disable=too-many-branches,too-many-statements + """ + :param dataset_str: + :param rel_layers: + :param limit: If > 0, will only load this many adj. matrices + All adjacencies are preloaded and saved to disk, + but only a limited a then restored to memory. 
+
+
+def load_data(dataset_str='aifb', limit=-1):  # pylint: disable=too-many-branches,too-many-statements
+    """
+    :param dataset_str: Name of the dataset to load (aifb, mutag, bgs, am).
+    :param limit: If > 0, will only load this many adjacency matrices.
+        All adjacencies are preloaded and saved to disk,
+        but only a limited number are then restored to memory.
+
+    :return:
+    """
+
+    print('Loading dataset', dataset_str)
+
+    dirname = os.path.dirname(os.path.realpath(sys.argv[0]))
+
+    if dataset_str == 'am':
+        data_url = 'https://www.dropbox.com/s/htisydfgwxmrx65/am_stripped.nt.gz?dl=1'
+        graph_file = 'data/am/am_stripped.nt.gz'
+
+        if not os.path.isfile(graph_file):
+            print('Downloading AM data.')
+            wget.download(data_url, graph_file)
+
+        task_file = 'data/am/completeDataset.tsv'
+        train_file = 'data/am/trainingSet.tsv'
+        test_file = 'data/am/testSet.tsv'
+        # 'cateogory' (sic) is kept as-is; it matches the column name in the dataset TSV
+        label_header = 'label_cateogory'
+        nodes_header = 'proxy'
+
+    elif dataset_str == 'aifb':
+        data_url = 'https://www.dropbox.com/s/fkvgvkygo2gf28k/aifb_stripped.nt.gz?dl=1'
+        # The RDF file containing the knowledge graph
+        graph_file = 'data/aifb/aifb_stripped.nt.gz'
+        if not os.path.isfile(graph_file):
+            print('Downloading AIFB data.')
+            wget.download(data_url, graph_file)
+
+        # The TSV file containing the classification task
+        task_file = 'data/aifb/completeDataset.tsv'
+        # The TSV file containing training indices
+        train_file = 'data/aifb/trainingSet.tsv'
+        # The TSV file containing test indices
+        test_file = 'data/aifb/testSet.tsv'
+        label_header = 'label_affiliation'
+        nodes_header = 'person'
+
+    elif dataset_str == 'mutag':
+        data_url = 'https://www.dropbox.com/s/qy8j3p8eacvm4ir/mutag_stripped.nt.gz?dl=1'
+        graph_file = 'data/mutag/mutag_stripped.nt.gz'
+        if not os.path.isfile(graph_file):
+            print('Downloading MUTAG data.')
+            wget.download(data_url, graph_file)
+        task_file = 'data/mutag/completeDataset.tsv'
+        train_file = 'data/mutag/trainingSet.tsv'
+        test_file = 'data/mutag/testSet.tsv'
+        label_header = 'label_mutagenic'
+        nodes_header = 'bond'
+
+    elif dataset_str == 'bgs':
+        data_url = 'https://www.dropbox.com/s/uqi0k9jd56j02gh/bgs_stripped.nt.gz?dl=1'
+        graph_file = 'data/bgs/bgs_stripped.nt.gz'
+        if not os.path.isfile(graph_file):
+            print('Downloading BGS data.')
+            wget.download(data_url, graph_file)
+        task_file = 'data/bgs/completeDataset_lith.tsv'
+        train_file = 'data/bgs/trainingSet(lith).tsv'
+        test_file = 'data/bgs/testSet(lith).tsv'
+        label_header = 'label_lithogenesis'
+        nodes_header = 'rock'
+
+    else:
+        raise NameError('Dataset name not recognized: ' + dataset_str)
+
+    os.makedirs('data/' + dataset_str + '/output', exist_ok=True)
+
+    adj_fprepend = 'data/' + dataset_str + '/output/adjacencies_'
+    labels_file = 'data/' + dataset_str + '/output/labels.npz'
+    train_idx_file = 'data/' + dataset_str + '/output/train_idx.npy'
+    test_idx_file = 'data/' + dataset_str + '/output/test_idx.npy'
+    train_names_file = 'data/' + dataset_str + '/output/train_names.npy'
+    test_names_file = 'data/' + dataset_str + '/output/test_names.npy'
+    rel_dict_file = 'data/' + dataset_str + '/output/rel_dict.pkl'
+    nodes_file = 'data/' + dataset_str + '/output/nodes.pkl'
+
+    graph_file = dirname + '/' + graph_file
+    task_file = dirname + '/' + task_file
+    train_file = dirname + '/' + train_file
+    test_file = dirname + '/' + test_file
+    adj_fprepend = dirname + '/' + adj_fprepend
+    labels_file = dirname + '/' + labels_file
+    train_idx_file = dirname + '/' + train_idx_file
+    test_idx_file = dirname + '/' + test_idx_file
+    train_names_file = dirname + '/' + train_names_file
+    test_names_file = dirname + '/' + test_names_file
+    rel_dict_file = dirname + '/' + rel_dict_file
+    nodes_file = dirname + '/' + nodes_file
+
+    adj_files = glob.glob(adj_fprepend + '*.npz')
+
+    if adj_files and os.path.isfile(labels_file) and \
+            os.path.isfile(train_idx_file) and os.path.isfile(test_idx_file):
+
+        # load precomputed adjacency matrices and labels
+
+        adj_files.sort(key=lambda f: int(re.search('adjacencies_(.+?).npz', f).group(1)))
+
+        if limit > 0:
+            adj_files = adj_files[:limit * 2]
+
+        adjacencies = [load_sparse_csr(file) for file in adj_files]
+        adj_shape = adjacencies[0].shape
+
+        print('Number of nodes: ', adj_shape[0])
+        print('Number of relations: ', len(adjacencies))
+
+        labels = load_sparse_csr(labels_file)
+        labeled_nodes_idx = list(labels.nonzero()[0])
+
+        print('Number of classes: ', labels.shape[1])
+
+        train_idx = np.load(train_idx_file)
+        test_idx = np.load(test_idx_file)
+        train_names = np.load(train_names_file)
+        test_names = np.load(test_names_file)
+
+        with open(rel_dict_file, 'rb') as fp:  # pylint: disable=invalid-name
+            relations_dict = pkl.load(fp)
+
+    else:
+
+        # loading labels of nodes
+        labels_df = pd.read_csv(task_file, sep='\t', encoding='utf-8')
+        labels_train_df = pd.read_csv(train_file, sep='\t', encoding='utf-8')
+        labels_test_df = pd.read_csv(test_file, sep='\t', encoding='utf-8')
+
+        with RDFReader(graph_file) as reader:
+
+            relations = reader.relation_list()
+            subjects = reader.subject_set()
+            objects = reader.object_set()
+
+            print([(rel, reader.freq(rel)) for rel in relations[:limit]])
+
+            nodes = list(subjects.union(objects))
+            adj_shape = (len(nodes), len(nodes))
+
+            print('Number of nodes: ', len(nodes))
+            print('Number of relations in the data: ', len(relations))
+
+            relations_dict = {rel: i for i, rel in enumerate(list(relations))}
+            nodes_dict = {node: i for i, node in enumerate(nodes)}
+
+            assert len(nodes_dict) < np.iinfo(np.int32).max
+
+            adjacencies = []
+
+            for i, rel in enumerate(
+                    relations if limit < 0 else relations[:limit]):
+
+                print(f'Creating adjacency matrix for rel {i}: {rel}, freq {reader.freq(rel)}')
+                edges = np.empty((reader.freq(rel), 2), dtype=np.int32)
+
+                size = 0
+                for j, (s, _, o) in enumerate(reader.triples(relation=rel)):  # pylint: disable=invalid-name
+                    if nodes_dict[s] >= len(nodes) or nodes_dict[o] >= len(nodes):
+                        print(s, o, nodes_dict[s], nodes_dict[o])
+
+                    edges[j] = np.array([nodes_dict[s], nodes_dict[o]])
+                    size += 1
+
+                print(f'{size} edges added')
+
+                row, col = np.transpose(edges)
+
+                data = np.ones(len(row), dtype=np.int8)
+
+                adj = sp.csr_matrix((data, (row, col)), shape=adj_shape, dtype=np.int8)
+
+                adj_transp = sp.csr_matrix((data, (col, row)), shape=adj_shape, dtype=np.int8)
+
+                save_sparse_csr(adj_fprepend + f'{(i * 2)}.npz', adj)
+                save_sparse_csr(adj_fprepend + f'{(i * 2 + 1)}.npz', adj_transp)
+
+                if limit < 0:
+                    adjacencies.append(adj)
+                    adjacencies.append(adj_transp)
+
+        # Reload the adjacency matrices from disk
+        if limit > 0:
+            adj_files = glob.glob(adj_fprepend + '*.npz')
+            adj_files.sort(key=lambda f: int(
+                re.search('adjacencies_(.+?).npz', f).group(1)))
+
+            adj_files = adj_files[:limit * 2]
+            for i, file in enumerate(adj_files):
+                adjacencies.append(load_sparse_csr(file))
+                print(f'{i + 1} adjacency matrices loaded')
+
+        nodes_u_dict = {str(key): val for key, val in nodes_dict.items()}
+
+        labels_set = set(labels_df[label_header].values.tolist())
+        labels_dict = {lab: i for i, lab in enumerate(list(labels_set))}
+
+        print(f'{len(labels_set)} classes: {labels_set}')
+
+        labels = sp.lil_matrix((adj_shape[0], len(labels_set)))
+        labeled_nodes_idx = []
+
+        print('Loading training set')
+
+        train_idx = []
+        train_names = []
+        for nod, lab in zip(labels_train_df[nodes_header].values,
+                            labels_train_df[label_header].values):
+            nod = str(nod)  # np.unicode is a deprecated alias of str
+            if nod in nodes_u_dict:
+                labeled_nodes_idx.append(nodes_u_dict[nod])
+                label_idx = labels_dict[lab]
+                labels[labeled_nodes_idx[-1], label_idx] = 1
+                train_idx.append(nodes_u_dict[nod])
+                train_names.append(nod)
+            else:
+                print(u'Node not in dictionary, skipped: ',
+                      nod.encode('utf-8', errors='replace'))
+
+        print('Loading test set')
+
+        test_idx = []
+        test_names = []
+        for nod, lab in zip(labels_test_df[nodes_header].values,
+                            labels_test_df[label_header].values):
+            nod = str(nod)
+            if nod in nodes_u_dict:
+                labeled_nodes_idx.append(nodes_u_dict[nod])
+                label_idx = labels_dict[lab]
+                labels[labeled_nodes_idx[-1], label_idx] = 1
+                test_idx.append(nodes_u_dict[nod])
+                test_names.append(nod)
+            else:
+                print(u'Node not in dictionary, skipped: ',
+                      nod.encode('utf-8', errors='replace'))
+
+        labeled_nodes_idx = sorted(labeled_nodes_idx)
+        labels = labels.tocsr()
+
+        save_sparse_csr(labels_file, labels)
+
+        np.save(train_idx_file, train_idx)
+        np.save(test_idx_file, test_idx)
+
+        np.save(train_names_file, train_names)
+        np.save(test_names_file, test_names)
+
+        with open(rel_dict_file, 'wb') as fp:  # pylint: disable=invalid-name
+            pkl.dump(relations_dict, fp)
+        with open(nodes_file, 'wb') as fp:  # pylint: disable=invalid-name
+            pkl.dump(nodes, fp)
+
+    features = sp.identity(adj_shape[0], format='csr')
+
+    return adjacencies, features, labels, labeled_nodes_idx, train_idx, test_idx, \
+        relations_dict, train_names, test_names
+
+
+def parse(symbol):
+    """ Strip enclosing angle brackets from an RDF symbol, if present. """
+    if symbol.startswith('<'):
+        return symbol[1:-1]
+    return symbol
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/prepare_dataset.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/prepare_dataset.py
new file mode 100644
index 0000000000000000000000000000000000000000..b49a63de62890a425e962c04c31000b1936ae783
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/prepare_dataset.py
@@ -0,0 +1,87 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+""" Prepare-dataset entry point: converts a dataset into the training format. """
""" + +import os +import sys +import time +import argparse + +import pickle as pkl + +import scipy.sparse as sp # pylint: disable=import-error + +import utils # pylint: disable=import-error + +from data_utils import load_data # pylint: disable=import-error + + +if __name__ == '__main__': + + ap = argparse.ArgumentParser() + ap.add_argument('-d', '--dataset', type=str, default='aifb', + help='Dataset string (aifb, mutag, bgs, am)') + + args = vars(ap.parse_args()) + + print(args) + + # Define parameters + DATASET = args['dataset'] + + # Get data + A, X, y, labeled_nodes_idx, train_idx, test_idx, rel_dict, train_names, test_names = load_data( + DATASET) + + rel_list = list(range(len(A))) + for key, value in rel_dict.items(): + if value * 2 >= len(A): + continue + rel_list[value * 2] = key + rel_list[value * 2 + 1] = key + '_INV' + + num_nodes = A[0].shape[0] + A.append(sp.identity(A[0].shape[0]).tocsr()) # add identity matrix + + support = len(A) + + print('Relations used and their frequencies' + str([a.sum() for a in A])) + + print('Calculating level sets...') + t = time.time() + # Get level sets (used for memory optimization) + bfs_generator = utils.bfs_relational(A, labeled_nodes_idx) + lvls = [] + lvls.append(set(labeled_nodes_idx)) + lvls.append(set.union(*next(bfs_generator))) + print('Done! Elapsed time ' + str(time.time() - t)) + + # Delete unnecessary rows in adjacencies for memory efficiency + todel = list(set(range(num_nodes)) - set.union(lvls[0], lvls[1])) + for i, _ in enumerate(A): + utils.csr_zero_rows(A[i], todel) + + data = {'A': A, + 'y': y, + 'train_idx': train_idx, + 'test_idx': test_idx + } + + dirname = os.path.dirname(os.path.realpath(sys.argv[0])) + + with open(dirname + '/' + DATASET + '.pickle', 'wb') as f: + pkl.dump(data, f, pkl.HIGHEST_PROTOCOL) diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/utils.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/utils.py new file mode 100644 index 0000000000000000000000000000000000000000..cb01dfeb4ad24b43fe1c52d7aec4a915aa406bce --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/datasets/utils.py @@ -0,0 +1,149 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +""" Datasets utils. """ + +import numpy as np # pylint: disable=import-error +import scipy.sparse as sp # pylint: disable=import-error + + +def csr_zero_rows(csr, rows_to_zero): + """ Set rows given by rows_to_zero in a sparse csr matrix to zero. + NOTE: Inplace operation! Does not return a copy of sparse matrix. """ + rows, _ = csr.shape + mask = np.ones((rows,), dtype=np.bool) + mask[rows_to_zero] = False + nnz_per_row = np.diff(csr.indptr) + + mask = np.repeat(mask, nnz_per_row) + nnz_per_row[rows_to_zero] = 0 + csr.data = csr.data[mask] + csr.indices = csr.indices[mask] + csr.indptr[1:] = np.cumsum(nnz_per_row) + csr.eliminate_zeros() + return csr + + +def csc_zero_cols(csc, cols_to_zero): + """ Set rows given by cols_to_zero in a sparse csc matrix to zero. 
+
+
+def csc_zero_cols(csc, cols_to_zero):
+    """ Set columns given by cols_to_zero in a sparse csc matrix to zero.
+    NOTE: Inplace operation! Does not return a copy of the sparse matrix. """
+    _, cols = csc.shape
+    mask = np.ones((cols,), dtype=bool)
+    mask[cols_to_zero] = False
+    nnz_per_col = np.diff(csc.indptr)
+
+    mask = np.repeat(mask, nnz_per_col)
+    nnz_per_col[cols_to_zero] = 0
+    csc.data = csc.data[mask]
+    csc.indices = csc.indices[mask]
+    csc.indptr[1:] = np.cumsum(nnz_per_col)
+    csc.eliminate_zeros()
+    return csc
+
+
+def sp_vec_from_idx_list(idx_list, dim):
+    """ Create a sparse column vector of dimensionality dim from a list of indices. """
+    shape = (dim, 1)
+    data = np.ones(len(idx_list))
+    row_ind = list(idx_list)
+    col_ind = np.zeros(len(idx_list))
+    return sp.csr_matrix((data, (row_ind, col_ind)), shape=shape)
+
+
+def sp_row_vec_from_idx_list(idx_list, dim):
+    """ Create a sparse row vector of dimensionality dim from a list of indices. """
+    shape = (1, dim)
+    data = np.ones(len(idx_list))
+    row_ind = np.zeros(len(idx_list))
+    col_ind = list(idx_list)
+    return sp.csr_matrix((data, (row_ind, col_ind)), shape=shape)
+
+
+def get_neighbors(adj, nodes):
+    """ Takes a set of nodes and a graph adjacency matrix and returns a set of neighbors. """
+    sp_nodes = sp_row_vec_from_idx_list(list(nodes), adj.shape[1])
+    sp_neighbors = sp_nodes.dot(adj)
+    neighbors = set(sp.find(sp_neighbors)[1])  # convert to set of indices
+    return neighbors
+
+
+def bfs(adj, roots):
+    """
+    Perform BFS on a graph given by an adjacency matrix adj.
+    Can take a set of multiple root nodes.
+    Root nodes have level 0, first-order neighbors have level 1, and so on.
+    """
+    visited = set()
+    current_lvl = set(roots)
+    while current_lvl:
+        for v in current_lvl:  # pylint: disable=invalid-name
+            visited.add(v)
+
+        next_lvl = get_neighbors(adj, current_lvl)
+        next_lvl -= visited  # set difference
+        yield next_lvl
+
+        current_lvl = next_lvl
+
+
+def bfs_relational(adj_list, roots):
+    """
+    BFS for graphs with multiple edge types. Returns list of level sets.
+    Each entry in the list corresponds to the relation specified by adj_list.
+    """
+    visited = set()
+    current_lvl = set(roots)
+
+    next_lvl = []
+    for rel, _ in enumerate(adj_list):
+        next_lvl.append(set())
+
+    while current_lvl:
+
+        for v in current_lvl:  # pylint: disable=invalid-name
+            visited.add(v)
+
+        for rel, _ in enumerate(adj_list):
+            next_lvl[rel] = get_neighbors(adj_list[rel], current_lvl)
+            next_lvl[rel] -= visited  # set difference
+
+        yield next_lvl
+
+        current_lvl = set.union(*next_lvl)
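+
+# get_neighbors above works by multiplying a sparse indicator row vector with the
+# adjacency matrix; a small illustration (hypothetical 3-node graph):
+#
+#     adj = sp.csr_matrix(np.array([[0, 1, 0],
+#                                   [0, 0, 1],
+#                                   [0, 0, 0]]))
+#     get_neighbors(adj, {0})   # -> {1}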
+ """ + print(max_lvl_size) + visited = set(roots) + current_lvl = set(roots) + while current_lvl: + + next_lvl = get_neighbors(adj, current_lvl) + next_lvl -= visited # set difference + + for v in next_lvl: # pylint: disable=invalid-name + visited.add(v) + + yield next_lvl + + current_lvl = next_lvl diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/modelzoo_level.txt b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/modelzoo_level.txt new file mode 100644 index 0000000000000000000000000000000000000000..d9efc3f080f2ca19da3c045992cf78a8fb9d7074 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/modelzoo_level.txt @@ -0,0 +1,3 @@ +FuncStatus:OK +PerfStatus:POK +PrecisionStatus:OK diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/requirements.txt b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..4037dc3f1cac912e465e8cecfa526de9af758c4a --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/requirements.txt @@ -0,0 +1,8 @@ +pandas == 1.3.4 +numpy == 1.21.3 +scipy == 1.7.1 +tensorflow == 1.15.0 + +rdflib == 6.0.2 +wget +matplotlib \ No newline at end of file diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/hyperparameters.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/hyperparameters.py new file mode 100644 index 0000000000000000000000000000000000000000..7c18e516d942dbb33dda361e484e1332448a2ca5 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/hyperparameters.py @@ -0,0 +1,88 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +""" Hyperparameters definitions. 
""" + +import argparse + +ap = argparse.ArgumentParser() +ap.add_argument('-d', '--dataset', type=str, default='aifb', + help='Dataset string (aifb, mutag, bgs, am)') +ap.add_argument('-de', '--device', type=str, default='cpu', + help='Device to be used string (cpu, gpu, npu)') +ap.add_argument('-did', '--device-id', type=int, default='0', + help='If gpu or npu is used, you can specify the device (0)') +ap.add_argument('-e', '--epochs', type=int, default=50, + help='Number training epochs (50)') +ap.add_argument('-hd', '--hidden', type=int, default=16, + help='Number hidden units (16)') +ap.add_argument('-do', '--dropout', type=float, default=0., + help='Dropout rate (0.)') +ap.add_argument('-b', '--bases', type=int, default=-1, + help='Number of bases used (-1: all)') +ap.add_argument('-lr', '--learnrate', type=float, default=0.01, + help='Learning rate (0.01)') +ap.add_argument('-l2', '--l2norm', type=float, default=0., + help='L2 normalization of input weights (0.)') +ap.add_argument('-sp', '--sparse', type=bool, default=True, + help='Use as many sparse tensors as possible (True)') +ap.add_argument('-co', '--compacted', type=bool, default=True, + help='Use only non empty adjacency matrices (True)') +ap.add_argument('-es', '--early-stopping', type=bool, default=False, + help='Stop early if validation loss is not improving (False)') +ap.add_argument('-spe', '--stats-per-epoch', type=int, default=5, + help='Show stats of the training per x epochs (5)') +ap.add_argument('-o', '--out-path', type=str, default='./results', + help='Sets the output path for data to be written on disk') + +fp = ap.add_mutually_exclusive_group(required=False) +fp.add_argument('--validation', dest='validation', action='store_true') +fp.add_argument('--testing', dest='validation', action='store_false') +ap.set_defaults(validation=True) + +fp = ap.add_mutually_exclusive_group(required=False) +fp.add_argument('--production', dest='production', action='store_true') +fp.add_argument('--debug', dest='production', action='store_false') +ap.set_defaults(production=False) + +args = vars(ap.parse_args()) + +# Define parameters +DATASET = args['dataset'] +EPOCHS = args['epochs'] +VALIDATION = args['validation'] +LEARNINGRATE = args['learnrate'] +L2REGULARIZER = args['l2norm'] +HIDDEN = args['hidden'] +BASES = args['bases'] +DROPOUT = args['dropout'] +DEVICE = args['device'] +DEVICE_ID = args['device_id'] +SPARSE = args['sparse'] +COMPACTED = args['compacted'] +EARLYSTOPPING = args['early_stopping'] +STATSPEREPOCH = args['stats_per_epoch'] +DEBUG = not args['production'] +OUT = args['out_path'] + +if DEBUG: + print(args) + +# Check parameters +assert DEVICE in ['cpu', 'gpu', 'npu'], 'Unknown device type' +assert EPOCHS >= 1, 'Epochs must be larger than 0' +assert 0 <= DROPOUT < 1, 'Dropout must be in [0, 1)' +# assert EARLYSTOPPING <= EPOCHS, 'Early stopping has to be equal or lower than epochs' diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/graph.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/graph.py new file mode 100644 index 0000000000000000000000000000000000000000..0d26f7cbf92c158889c34f0491f68563dbe96ef6 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/graph.py @@ -0,0 +1,227 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/graph.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/graph.py
new file mode 100644
index 0000000000000000000000000000000000000000..0d26f7cbf92c158889c34f0491f68563dbe96ef6
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/graph.py
@@ -0,0 +1,227 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+""" Graph Convolution Model and Layers. """
+
+import tensorflow as tf  # pylint: disable=import-error
+
+
+class GraphConvolutionLayer(tf.Module):
+
+    """ Graph Convolution Layer Class """
+
+    def __init__(self,
+                 input_dim,              # number of inputs per node
+                 output_dim,             # number of feature maps per node
+                 support=1,              # filter support / number of weights
+                 featureless=False,      # use/ignore input features
+                 init='glorot_uniform',  # init for variables
+                 activation='linear',    # optional activation
+                 weights=None,           # optional use predefined weights
+                 num_bases=-1,           # optional less bases than support
+                 bias=False,             # optional use bias before activation
+                 sparse=False,           # optional use sparse tensors
+                 dropout=0.0,            # optional use dropout
+                 dropout_fn=None,        # optional use own dropout function
+                 name='GCL',             # optional name for the layer
+                 **kwargs):
+
+        # Check for assertions
+        assert support >= 1
+
+        self.num_bases = num_bases
+        self.num_nodes = support
+        self.support = support
+
+        self.activation = activation
+        self.sparse = sparse
+
+        self.__name = name
+
+        self.input_dim = input_dim
+        self.output_dim = output_dim
+
+        self.featureless = featureless
+        self.dropout = dropout
+        self.dropout_fn = dropout_fn if dropout_fn else tf.nn.dropout
+
+        if init == 'glorot_uniform':
+            init = tf.glorot_uniform_initializer()
+        else:
+            init = tf.random_normal_initializer()
+
+        if weights is not None:  # truthiness of an array/tensor would be ambiguous
+            def tmp_init():
+                return weights
+            init = tmp_init
+
+        if self.num_bases > 0:
+            # Basis decomposition (Schlichtkrull et al. 2017): each relation's weight
+            # matrix is a learned combination of num_bases shared basis matrices,
+            # W_r = sum_b a_rb * V_b, with V_b in `kernel` and a_rb in `kernel_comp`.
+            kernel_temp = [tf.Variable(init([input_dim, output_dim]),
+                                       shape=[input_dim, output_dim])
+                           for _ in range(self.num_bases)]
+            self.kernel = tf.concat(kernel_temp, axis=0, name='W')
+
+            self.kernel_comp = tf.Variable(init([self.support, self.num_bases]),
+                                           shape=[self.support, self.num_bases],
+                                           name='W_comp')
+        else:
+            kernel_temp = [tf.Variable(init([input_dim, output_dim]),
+                                       shape=[input_dim, output_dim])
+                           for _ in range(self.support)]
+            self.kernel = tf.concat(kernel_temp, axis=0, name='W')
+
+        self.bias = None
+        if bias:
+            self.bias = tf.Variable(tf.zeros([1, output_dim]),
+                                    shape=[1, output_dim], name='b')
+
+        super(GraphConvolutionLayer, self).__init__(**kwargs)
+
+
+    def _matmul_(self, left, right):
+        """ MatMul depending on class settings """
+        if self.sparse and self.featureless:
+            tmp_name = 'SPARSE_MATMUL_' + self.__name  # temporary name to debug
+            output = tf.sparse.sparse_dense_matmul(left, right, name=tmp_name)
+        else:
+            output = tf.matmul(left, right, name='MATMUL_' + self.__name)
+        return output
+
+
+    def __call__(self, x, training=False):  # pylint: disable=invalid-name
+        """ forward call """
+        features = x[0]
+        adjacencies = x[1]
+        adjacencies_con = x[2]
+
+        num_nodes = features.shape[0]
+        num_features = features.shape[1]
+
+        idc = [None] * self.support
+        supports = [None] * self.support
+
+        for i in range(self.support):
+            if not self.featureless:
+                if self.sparse:
+                    name_debug = 'FEATURE_MATMUL_' + self.__name  # temporary name to debug
+                    adj = adjacencies[i]
+                    matmul_tensor = tf.sparse.sparse_dense_matmul(adj, features, name=name_debug)
+                else:
+                    matmul_tensor = tf.matmul(adjacencies[i], features)
+                # supports[i] = matmul_tensor  # code for tf.concat
+                # supports[i] = matmul_tensor  # code for tf.concat
+
+                matmul_tensor = tf.transpose(matmul_tensor)
+
+                idc[i] = [i]
+                supports[i] = matmul_tensor
+            elif not self.sparse:
+                adj = tf.transpose(adjacencies[i])
+
+                idc[i] = [i]
+                supports[i] = adj
+            # else:
+            #     supports[i] = adjacencies[i]  # code for tf.concat
+
+        if self.featureless:
+            supports = adjacencies_con
+        else:
+            # workaround to not use tf.concat
+            # supports = tf.concat(supports, axis=1, name='CONCAT_' + self.__name)
+            supports_shape = (self.support, num_features, num_nodes)
+            supports = tf.scatter_nd(idc, supports, supports_shape)
+            supports = tf.reshape(supports, (self.support * num_features, num_nodes))
+            supports = tf.transpose(supports)
+
+        if self.num_bases > 0:
+            kernel = tf.reshape(self.kernel,
+                                (self.num_bases, self.input_dim, self.output_dim))
+            kernel = tf.transpose(kernel, perm=(1, 0, 2))
+            kernel_base = tf.matmul(self.kernel_comp, kernel)
+            kernel_base = tf.reshape(kernel_base, (self.support * self.input_dim, self.output_dim))
+            output = self._matmul_(supports, kernel_base)
+        else:
+            output = self._matmul_(supports, self.kernel)
+
+        # if featureless add dropout to output,
+        # by elementwise multiplying with column vector of ones,
+        # with dropout applied to the vector of ones.
+        if training and self.featureless and self.dropout > 0.0:
+            tmp = tf.ones(num_features)
+            tmp_do = self.dropout_fn(tmp, keep_prob=(1 - self.dropout))
+            output = tf.transpose(tf.transpose(output) * tmp_do)
+
+        # self.bias is either None or a tf.Variable; avoid tensor truthiness.
+        if self.bias is not None:
+            output = tf.add(output, self.bias)
+
+        if self.activation != 'linear':
+            output = self.activation(output, name=self.__name)
+
+        return output
+
+
+class RelationalGraphConvolutionModel(tf.Module):
+
+    """ Relational Graph Convolution Model Class """
+
+    def __init__(self,
+                 input_dim,
+                 output_dim,
+                 hidden_dim,
+                 support,
+                 num_bases=-1,
+                 featureless=True,
+                 dropout=0.,
+                 dropout_fn=None,
+                 sparse=True,
+                 model_name='RGCN'):
+
+        self.input_dim = input_dim
+        self.output_dim = output_dim
+        self.hidden_dim = hidden_dim
+
+        self.dropout = dropout
+        self.dropout_fn = dropout_fn if dropout_fn else tf.nn.dropout
+        self.sparse = sparse
+
+        self.model_name = model_name
+
+        self.graph_conv_1 = GraphConvolutionLayer(
+            self.input_dim, self.hidden_dim, support=support,
+            featureless=featureless, activation=tf.nn.relu,
+            num_bases=num_bases, sparse=sparse, name='GCL1')
+
+        self.graph_conv_2 = GraphConvolutionLayer(
+            self.hidden_dim, self.output_dim, support=support,
+            featureless=False,
+            num_bases=num_bases, sparse=sparse, name='GCL2')
+
+        super(RelationalGraphConvolutionModel, self).__init__()
+
+
+    def __call__(self, x, training=False):  # pylint: disable=invalid-name
+        """ forward call """
+        features = x[0]
+        adjacencies = x[1]
+        adjacencies_con = x[2]
+
+        # Propagate the training flag so layer-level dropout (if any) is active.
+        hidden = self.graph_conv_1((features, adjacencies, adjacencies_con), training=training)
+
+        if training and self.dropout_fn and self.dropout > 0.0:
+            hidden = self.dropout_fn(hidden, keep_prob=(1 - self.dropout))
+
+        logits = self.graph_conv_2((hidden, adjacencies, adjacencies_con), training=training)
+
+        return logits
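+
+# Usage sketch (mirrors rgcn/train.py): build the two-layer model once and
+# call it on (features, per-relation adjacencies, concatenated adjacency):
+#   model = RelationalGraphConvolutionModel(input_dim=X.shape[1],
+#                                           output_dim=y.shape[1],
+#                                           hidden_dim=16, support=len(A))
+#   logits = model((X_sp, A_sp, A_con_sp), training=True)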
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/input.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/input.py
new file mode 100644
index 0000000000000000000000000000000000000000..d70115b2b9224ca8826f9a0923669af93f889a96
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/layers/input.py
@@ -0,0 +1,76 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+""" Sparse Input Placeholders. """
+
+import numpy as np  # pylint: disable=import-error
+import tensorflow as tf  # pylint: disable=import-error
+import scipy.sparse as sp  # pylint: disable=import-error
+
+
+if tf.__version__.startswith('1.15'):
+
+    variable_scope = tf.compat.v1.variable_scope
+    placeholder = tf.compat.v1.placeholder
+
+else:
+
+    variable_scope = tf.variable_scope
+    placeholder = tf.placeholder
+
+
+def create_input_placeholders(data, name, sparse=False, convert=True):
+    """ Create Input Placeholder for data """
+
+    dense_shape = data.shape
+
+    if sparse and convert:
+        data, idx, nnz = convert_data_to_sparse(data)
+
+        # Feed a single explicit zero so empty matrices still have a valid
+        # sparse representation.
+        if nnz == 0:
+            data = [0.0]
+            idx = np.array([[1, 1]])
+            nnz = 1
+    else:
+        idx = 0
+        nnz = None
+
+    with variable_scope('Inputs', reuse=True):
+        if sparse:
+            place_data = placeholder(tf.float32, (nnz,), name=name + '_data')
+            place_idx = placeholder(tf.int64, (nnz, 2), name=name + '_idx')
+            place_shape = dense_shape
+        else:
+            place_data = placeholder(tf.float32, data.shape, name=name + '_data')
+            place_idx = placeholder(tf.int64, (), name=name + '_idx')
+            place_shape = dense_shape
+
+    return place_data, place_idx, place_shape, data, idx, nnz, dense_shape
+
+
+def convert_data_to_sparse(data):
+    """ Convert data tensor to sparse data tensor """
+
+    data_coo = data
+    if not sp.issparse(data_coo):
+        data_coo = sp.coo_matrix(data_coo)
+    elif not sp.isspmatrix_coo(data_coo):
+        data_coo = data_coo.tocoo()
+    data = data_coo.data
+    idx = np.array([data_coo.row, data_coo.col]).T
+    # Use the number of stored entries so len(data) == nnz even when explicit
+    # zeros are stored (count_nonzero() would exclude them).
+    nnz = data_coo.nnz
+
+    return data, idx, nnz
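+
+# Usage sketch (as in rgcn/train.py): the returned tuple bundles the graph
+# placeholders with the host-side values fed at session run time:
+#   place_data, place_idx, _, data, idx, _, _ = \
+#       create_input_placeholders(A[i], name='A' + str(i), sparse=True)
+#   feed_dict[place_data] = data
+#   feed_dict[place_idx] = idx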
""" + +from datetime import datetime + +import numpy as np # pylint: disable=import-error + +import matplotlib.pyplot as plt # pylint: disable=import-error +from matplotlib.ticker import FormatStrFormatter # pylint: disable=import-error + + +class CaptureMetrics: + """ Capture metrics class """ + + _metrics = { + 'train_loss': [], + 'valid_loss': [], + 'valid_acc': [], + 'test_acc': [], + } + + _timers = { + 'epoch_timer': [], + 'batch_timer': [], + 'inference_timer': [], + } + + _start_timers = { + 'overall': None, + 'epoch_start': None, + 'batch_start': None, + } + + def __init__(self, name, output='results'): + """ Init method """ + self.name = name + self.output = output + self.start_timer = datetime.now() + + + def set_overall_timer(self): + """ Set the overall timer """ + self._start_timers['overall'] = datetime.now() + + + def set_epoch_timer(self): + """ Set the initial epoch timer """ + self._start_timers['epoch_start'] = datetime.now() + + + def add_epoch_timer(self): + """ Finish an epoch based on pervious timer """ + if self._start_timers['epoch_start']: + epoch_timer = (datetime.now() - self._start_timers['epoch_start']).total_seconds() + self._timers['epoch_timer'].append(epoch_timer) + self._start_timers['epoch_start'] = datetime.now() + else: + self._start_timers['epoch_start'] = datetime.now() + + + def set_batch_timer(self): + """ Set the starting batch timer """ + self._start_timers['batch_start'] = datetime.now() + + + def add_batch_timer(self): + """ Add a batch timer """ + if self._start_timers['batch_start']: + batch_timer = (datetime.now() - self._start_timers['batch_start']).total_seconds() + self._timers['batch_timer'].append(batch_timer) + self._start_timers['batch_start'] = datetime.now() + else: + self._start_timers['batch_start'] = datetime.now() + + + def add_metric_value(self, metric, value): + """ Add value to a metric list """ + if metric in self._metrics: + self._metrics[metric].append(value) + else: + self._metrics[metric] = [] + self._metrics[metric].append(value) + + + def print_metrics(self): + """ Print collected metrics """ + print_str = '' + for key, value in self._metrics.items(): + if len(value) > 0: + print_str += f'{key}={np.mean(value):.4f} ' + print(print_str) + + + def draw_metrics(self): + """ Draw collected metrics in a plot """ + keys = [k for k, v in self._metrics.items() if len(v) > 0] + splits = list({k.split('_')[1] for k in keys}) + plots = {} + for s in splits: # pylint: disable=invalid-name + plots[s] = [k for k in keys if s in k] + subplots = len(splits) + fig, axs = plt.subplots(nrows=subplots, ncols=1) + fig.suptitle(f'Metrics for {self.name}') + for i, s in enumerate(plots): # pylint: disable=invalid-name + for k in plots[s]: + axs[i].plot(self._metrics[k], label=f'{k}') + start, end = axs[i].get_ylim() + axs[i].yaxis.set_ticks(np.arange(start, end, (end - start) / 5.001)) + axs[i].yaxis.set_major_formatter(FormatStrFormatter('%.2f')) + axs[i].legend() + tmp_start_timer = self.start_timer.strftime('%Y-%m-%d-%H-%M-%S') + plt.savefig(self.output + f'/{tmp_start_timer}_{self.name}_metrics_output.png') + + + def draw_timers(self): + """ Draw collected timers in a plot """ + keys = [k for k, v in self._timers.items() if len(v) > 0] + subplots = len(keys) + fig, axs = plt.subplots(nrows=subplots, ncols=1) + fig.suptitle(f'Timers for {self.name}') + if subplots > 1: + for i, k in enumerate(keys): + axs[i].plot(self._timers[k], label=f'{k}') + axs[i].yaxis.set_major_formatter(FormatStrFormatter('%.2f')) + 
+
+
+    def draw_timers(self):
+        """ Draw collected timers in a plot """
+        keys = [k for k, v in self._timers.items() if len(v) > 0]
+        subplots = len(keys)
+        fig, axs = plt.subplots(nrows=subplots, ncols=1)
+        axs = np.atleast_1d(axs)  # a single subplot is returned unwrapped
+        fig.suptitle(f'Timers for {self.name}')
+        for i, k in enumerate(keys):
+            axs[i].plot(self._timers[k], label=f'{k}')
+            axs[i].yaxis.set_major_formatter(FormatStrFormatter('%.2f'))
+            axs[i].axhline(y=np.median(self._timers[k]), color='r', linestyle='-',
+                           label=str(np.median(self._timers[k])))
+            axs[i].legend()
+        tmp_start_timer = self.start_timer.strftime('%Y-%m-%d-%H-%M-%S')
+        plt.savefig(self.output + f'/{tmp_start_timer}_{self.name}_timers_output.png')
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/train.py b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/train.py
new file mode 100644
index 0000000000000000000000000000000000000000..20ea38561886f1f458f6a5f8a92b9a0772462c4d
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/rgcn/train.py
@@ -0,0 +1,366 @@
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# Author: Udo Schlegel
+
+""" Main train file. """
+
+from __future__ import print_function
+
+import os
+import sys
+import time
+from datetime import datetime
+import pickle as pkl
+import numpy as np  # pylint: disable=import-error
+import scipy.sparse as sp  # pylint: disable=import-error
+
+import tensorflow as tf  # pylint: disable=import-error
+
+from tensorflow.python.framework.graph_util import convert_variables_to_constants  # pylint: disable=import-error
+from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig  # pylint: disable=import-error
+
+# Project-local modules
+import hyperparameters as hp  # pylint: disable=import-error
+
+import utils  # pylint: disable=import-error
+import metrics  # pylint: disable=import-error
+
+import layers.graph as graph_in  # pylint: disable=import-error
+from layers.input import create_input_placeholders  # pylint: disable=import-error
+
+if hp.DEBUG:
+    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '0'
+else:
+    import warnings
+
+    warnings.filterwarnings('ignore')
+
+    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
+
+
+if tf.__version__.startswith('1.15'):
+
+    Session = tf.compat.v1.Session
+    summaryFileWriter = tf.compat.v1.summary.FileWriter
+    trainSaver = tf.compat.v1.train.Saver
+    trainable_variables = tf.compat.v1.trainable_variables
+    global_variables_initializer = tf.compat.v1.global_variables_initializer
+    softmax_cross_entropy_with_logits = tf.nn.softmax_cross_entropy_with_logits_v2
+    AdamOptimizer = tf.compat.v1.train.AdamOptimizer
+
+else:
+
+    Session = tf.Session
+    summaryFileWriter = tf.summary.FileWriter
+    trainSaver = tf.train.Saver
+    trainable_variables = tf.trainable_variables
+    global_variables_initializer = tf.global_variables_initializer
+    softmax_cross_entropy_with_logits = tf.nn.softmax_cross_entropy_with_logits
+    AdamOptimizer = tf.train.AdamOptimizer
+
+
+# Set defaults
+np.random.seed(1)
+
+if hp.DEVICE == 'npu':
+    from npu_bridge.estimator import npu_ops  # pylint: disable=import-error
+
+    dropout_fn = npu_ops.dropout
+
+else:
+    dropout_fn = tf.nn.dropout
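+
+# Note: the code below assumes both dropout variants expose the same
+# TF1-style interface, e.g. dropout_fn(hidden, keep_prob=1 - hp.DROPOUT);
+# npu_ops.dropout is used as the NPU-side replacement for tf.nn.dropout.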
+
+
+def set_device():
+    """ Set device configs """
+
+    config = tf.compat.v1.ConfigProto()
+    if hp.DEVICE == 'npu':
+
+        # Fall back to the configured device id if the environment variable
+        # is unset or empty (a plain lookup would raise a KeyError).
+        if str(os.environ.get('ASCEND_DEVICE_ID', '')) == '':
+            os.environ['ASCEND_DEVICE_ID'] = str(hp.DEVICE_ID)
+
+        profiling_mode = False  # set to True to collect profiling data
+        profiling_options = '{"output":"results","task_trace":"on"}'
+
+        custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
+
+        custom_op.name = 'NpuOptimizer'
+        custom_op.parameter_map['use_off_line'].b = True
+        custom_op.parameter_map['precision_mode'].s = tf.compat.as_bytes('allow_mix_precision')
+        custom_op.parameter_map['profiling_mode'].b = profiling_mode
+        custom_op.parameter_map['profiling_options'].s = tf.compat.as_bytes(profiling_options)
+
+        config.graph_options.rewrite_options.remapping = RewriterConfig.OFF
+
+    elif hp.DEVICE == 'gpu':
+
+        os.environ['CUDA_VISIBLE_DEVICES'] = str(hp.DEVICE_ID)
+
+        config.gpu_options.allow_growth = True
+
+    elif hp.DEVICE == 'cpu':
+
+        os.environ['CUDA_VISIBLE_DEVICES'] = str('')
+
+    return config
+
+
+def create_feed_dict(X, A, A_con, y_train, train_idx, train_mask, placeholders):  # pylint: disable=invalid-name
+    """ Create feed dict for the session run """
+
+    feed_dict = {}
+
+    feed_dict[placeholders['X'][0]] = X[0]
+    feed_dict[placeholders['X'][1]] = X[1]
+
+    for i, adj in enumerate(A):  # pylint: disable=invalid-name
+        feed_dict[placeholders['A'][i][0]] = adj[0]
+        feed_dict[placeholders['A'][i][1]] = adj[1]
+
+    feed_dict[placeholders['A_con'][0]] = A_con[0]
+    feed_dict[placeholders['A_con'][1]] = A_con[1]
+
+    feed_dict[placeholders['y_train']] = y_train
+    feed_dict[placeholders['train_idx']] = train_idx
+    feed_dict[placeholders['train_mask']] = train_mask
+
+    return feed_dict
+
+
+def _print(*args, **kwargs):
+    """ Conditional print if debug is true """
+
+    if hp.DEBUG:
+        print(*args, **kwargs)
+
+
+def normalize_adj(A):  # pylint: disable=invalid-name
+    """ Row-normalize each adjacency matrix """
+    A_compacted = []  # pylint: disable=invalid-name
+    # Normalize adjacency matrices individually
+    for i, _ in enumerate(A):
+        d = np.array(A[i].sum(1)).flatten()  # pylint: disable=invalid-name
+        d_inv = 1. / d  # pylint: disable=invalid-name
+        d_inv[np.isinf(d_inv)] = 0.
+        D_inv = sp.diags(d_inv)  # pylint: disable=invalid-name
+        A[i] = D_inv.dot(A[i]).tocsr().astype('float32')
+        # count_nonzero() is a sparse-matrix method, so query it before a
+        # possible conversion to a dense array.
+        nonzero = A[i].count_nonzero()
+        if not hp.SPARSE:
+            A[i] = A[i].toarray()
+        if nonzero > 0:
+            A_compacted.append(A[i])
+
+    # Always keep the full list around; it would be undefined below otherwise.
+    A_bak = A  # pylint: disable=invalid-name
+    if hp.COMPACTED:
+        A = A_compacted  # pylint: disable=invalid-name
+
+    return A, A_bak
+
+
+def main():  # pylint: disable=too-many-statements
+    """ Main function in which the train works """
+
+    _print('-' * 5, 'Using device', hp.DEVICE, '-' * 5)
+
+    config = set_device()
+
+    dirname = os.path.dirname(os.path.realpath(sys.argv[0]))
+
+    out_path = hp.OUT
+    if not os.path.exists(out_path):
+        os.mkdir(out_path)
+
+    with open(dirname + '/' + hp.DATASET + '.pickle', 'rb') as f:  # pylint: disable=invalid-name
+        data = pkl.load(f)
+
+    A = data['A']  # pylint: disable=invalid-name
+    y = data['y']  # pylint: disable=invalid-name
+    train_idx = data['train_idx']
+    test_idx = data['test_idx']
+
+    # Get dataset splits
+    y_train, y_val, y_test, idx_train, idx_val, idx_test = utils.get_splits(y, train_idx,
+                                                                            test_idx,
+                                                                            hp.VALIDATION)
+    train_mask = utils.sample_mask(idx_train, y.shape[0])
+    train_idx = np.array(idx_train)
+
+    # Define empty dummy feature matrix (input is ignored as we set featureless=True)
+    # In case features are available, define them here and set featureless=False,
+    # for example as sketched in the comment below.
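+    # Sketch (assumes a dense node-feature array `feats` with shape
+    # [num_nodes, num_feats] shipped alongside the dataset pickle):
+    #     X = sp.csr_matrix(feats).astype('float32')
+    #     # ...and construct the model below with featureless=False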
+    X = sp.csr_matrix(A[0].shape).astype('float32')  # pylint: disable=invalid-name
+    if not hp.SPARSE:
+        X = X.toarray()  # pylint: disable=invalid-name
+
+    A, A_bak = normalize_adj(A)  # pylint: disable=invalid-name
+
+    num_nodes = A[0].shape[0]
+    support = len(A)
+
+    input_dim = X.shape[1]
+    output_dim = y_train.shape[1]
+
+    A_con = sp.hstack(A)  # pylint: disable=invalid-name
+
+    _print('Input, Support, Org Support, Num Nodes, Output')
+    _print(input_dim, support, len(A_bak), num_nodes, output_dim)
+
+    placeholders = {}
+
+    G = tf.Graph()  # pylint: disable=invalid-name
+    sess = Session(graph=G, config=config)
+    with G.as_default():
+
+        # Create Input Placeholders
+        X_in = create_input_placeholders(X, name='X', sparse=hp.SPARSE)  # pylint: disable=invalid-name
+
+        placeholders['X'] = [X_in[0], X_in[1]]
+        X_data = [X_in[3], X_in[4]]  # pylint: disable=invalid-name
+
+        A_in = [  # pylint: disable=invalid-name
+            create_input_placeholders(A[i], name='A' + str(i), sparse=hp.SPARSE)
+            for i in range(support)  # pylint: disable=invalid-name
+        ]  # pylint: disable=invalid-name
+
+        placeholders['A'] = [[A[0], A[1]] for A in A_in]  # pylint: disable=invalid-name
+        A_data = [[A[3], A[4]] for A in A_in]  # pylint: disable=invalid-name
+
+        A_con_in = create_input_placeholders(A_con, name='A_con', sparse=hp.SPARSE)  # pylint: disable=invalid-name
+
+        placeholders['A_con'] = [A_con_in[0], A_con_in[1]]
+        A_con_data = [A_con_in[3], A_con_in[4]]  # pylint: disable=invalid-name
+
+        with tf.variable_scope('Inputs', reuse=True):
+            train_mask_in = tf.placeholder(tf.int32, train_mask.shape, name='train_mask')
+            train_idx_in = tf.placeholder(tf.int32, train_idx.shape, name='train_idx')
+            y_train_in = tf.placeholder(tf.int32, y_train.shape, name='y_train')
+
+        placeholders['train_mask'] = train_mask_in
+        placeholders['train_idx'] = train_idx_in
+        placeholders['y_train'] = y_train_in
+
+        # Create Sparse Tensors for Placeholders if needed
+        if hp.SPARSE:
+            X_sp = tf.sparse.SparseTensor(X_in[1], X_in[0], X_in[2])  # pylint: disable=invalid-name
+            A_sp = [tf.sparse.SparseTensor(A[1], A[0], A[2]) for A in A_in]  # pylint: disable=invalid-name
+
+            A_con_sp = tf.sparse.SparseTensor(A_con_in[1], A_con_in[0], A_con_in[2])  # pylint: disable=invalid-name
+        else:
+            X_sp = X_in[0]  # pylint: disable=invalid-name
+            A_sp = [A[0] for A in A_in]  # pylint: disable=invalid-name
+
+            A_con_sp = A_con_in[0]  # pylint: disable=invalid-name
+
+        # Create Model (wire the configured dropout rate through to the model)
+        model = graph_in.RelationalGraphConvolutionModel(
+            input_dim=input_dim, output_dim=output_dim, hidden_dim=hp.HIDDEN,
+            support=support, num_bases=hp.BASES, dropout=hp.DROPOUT, dropout_fn=dropout_fn,
+        )
+
+        logits = model((X_sp, A_sp, A_con_sp), training=True)
+
+        masked_logits = tf.gather(logits, train_idx_in, name='masked_logits')
+        masked_labels = tf.gather(y_train_in, train_idx_in, name='masked_labels')
+
+        cross_entropy = softmax_cross_entropy_with_logits(
+            labels=masked_labels, logits=masked_logits)
+        loss_op = tf.reduce_mean(cross_entropy, name='cross_entropy_loss')
+
+        if hp.L2REGULARIZER > 0:
+            l2_loss = tf.nn.l2_loss(model.graph_conv_1.kernel)
+            loss_op = tf.add(loss_op, l2_loss * hp.L2REGULARIZER, name='combined_loss')
+
+        optimizer = AdamOptimizer(learning_rate=hp.LEARNINGRATE)
+        train_op = optimizer.minimize(loss_op, var_list=trainable_variables())
+
+        logits_predict = model((X_sp, A_sp, A_con_sp), training=False)
+        softmax_prediction = tf.nn.softmax(logits_predict, name='softmax_prediction')
+
+        sess.run(global_variables_initializer())
+        saver = trainSaver(trainable_variables())
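+
+        # Note: the model object is called twice (training=True for the
+        # loss/train_op, training=False for inference); both graph passes
+        # share the variables created in the layer constructors, so the
+        # saver above captures a single set of weights.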
+        # writer = summaryFileWriter(logdir='logdir', graph=G)
+        # writer.flush()
+
+        cap_metrics = metrics.CaptureMetrics(hp.DEVICE + '-' + hp.DATASET)
+        cap_metrics.set_overall_timer()
+
+        print('Start Training')
+
+        loss_vals = []
+        timer = datetime.now()
+        for epoch in range(1, hp.EPOCHS + 1):
+
+            cap_metrics.set_epoch_timer()
+
+            epoch_timer = time.time()
+
+            input_dict = create_feed_dict(X_data, A_data, A_con_data, y_train,
+                                          train_idx, train_mask, placeholders)
+
+            loss_train, _ = sess.run([loss_op, train_op], feed_dict=input_dict)
+            print(f'Loss={loss_train:.6f}', f'Time={(time.time() - epoch_timer):.3f}')
+
+            preds = sess.run(softmax_prediction, feed_dict=input_dict)
+            loss_train_val, acc_train_val = utils.evaluate_preds(preds,
+                                                                 [y_train, y_val],
+                                                                 [idx_train, idx_val])
+            loss_train, loss_val = loss_train_val
+            acc_train, acc_val = acc_train_val
+            loss_vals.append(loss_val)
+
+            cap_metrics.add_epoch_timer()
+
+            cap_metrics.add_metric_value('train_loss', loss_train)
+            cap_metrics.add_metric_value('val_loss', loss_val)
+            cap_metrics.add_metric_value('train_acc', acc_train)
+            cap_metrics.add_metric_value('val_acc', acc_val)
+
+            if epoch % hp.STATSPEREPOCH == 0:
+                timer_ = (datetime.now() - timer).total_seconds()
+                print(f'Epoch ({epoch}) [{(time.time() - epoch_timer):.4f}]: '
+                      f'loss_train={loss_train:.4f}, loss_val={loss_val:.4f}, '
+                      f'acc_train={acc_train:.2f}, acc_val={acc_val:.2f}, '
+                      f'epochs_per_second={(epoch / timer_):.4g}')
+
+            if hp.EARLYSTOPPING \
+                    and epoch > 10 and loss_vals[-1] > np.mean(loss_vals[-10:-2]):
+                print('Validation loss is not improving anymore, stopping here:', loss_vals[-10:-2])
+                break
+
+        time_needed = (datetime.now() - timer).total_seconds()
+        print(f'Training ended after {epoch} epochs. Time elapsed: {time_needed}')
+
+        # Testing
+        test_loss, test_acc = utils.evaluate_preds(preds, [y_test], [idx_test])
+        print(f'Test set results: loss_test={test_loss[0]:.4f} acc_test={test_acc[0]:.4f}')
+
+        cap_metrics.draw_metrics()
+        cap_metrics.draw_timers()
+
+        ckpt_path = os.path.join(out_path, f'rgcn_{hp.DATASET}_checkpoint')
+        saver.save(sess, ckpt_path, global_step=None)
+
+        pb_path = os.path.join(out_path, f'rgcn_{hp.DATASET}_constant_graph.pb')
+        constant_graph = convert_variables_to_constants(sess, sess.graph_def, ['softmax_prediction'])
+        with tf.gfile.FastGFile(pb_path, mode='wb') as f:  # pylint: disable=invalid-name
+            f.write(constant_graph.SerializeToString())
+
+    sess.close()
+
+
+if __name__ == '__main__':
+    main()
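+
+# Inference from the frozen graph (sketch, using the pb_path written above):
+#   with tf.gfile.FastGFile(pb_path, 'rb') as f:
+#       graph_def = tf.GraphDef()
+#       graph_def.ParseFromString(f.read())
+#   tf.import_graph_def(graph_def, name='')
+#   # then fetch the 'softmax_prediction' tensor from the imported graph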
""" + +import numpy as np # pylint: disable=import-error +import scipy.sparse as sp # pylint: disable=import-error + + +def get_splits(y, train_idx, test_idx, validation=True): # pylint: disable=invalid-name + """ Split data into train and test y """ + + # Make dataset splits + # np.random.shuffle(train_idx) + if validation: + idx_train = train_idx[len(train_idx) / 5:] + idx_val = train_idx[:len(train_idx) / 5] + idx_test = idx_val # report final score on validation set for hyperparameter optimization + else: + idx_train = train_idx + idx_val = train_idx # no validation + idx_test = test_idx + + y_train = np.zeros(y.shape) + y_val = np.zeros(y.shape) + y_test = np.zeros(y.shape) + + y_train[idx_train] = np.array(y[idx_train].todense()) + y_val[idx_val] = np.array(y[idx_val].todense()) + y_test[idx_test] = np.array(y[idx_test].todense()) + + return y_train, y_val, y_test, idx_train, idx_val, idx_test + + +def normalize_adj(adj, symmetric=True): + """ Normalize adjacency matrix """ + + if symmetric: + diag = sp.diags(np.power(np.array(adj.sum(1)), -0.5).flatten()) + a_norm = adj.dot(diag).transpose().dot(diag).tocsr() + else: + diag = sp.diags(np.power(np.array(adj.sum(1)), -1).flatten()) + a_norm = diag.dot(adj).tocsr() + + return a_norm + + +def preprocess_adj(adj, symmetric=True): + """ Preprocess adjacency matrix """ + + adj = normalize_adj(adj, symmetric) + + return adj + + +def sample_mask(idx, l): # pylint: disable=invalid-name + """ Create sample mask """ + + mask = np.zeros(l) + mask[idx] = 1 + + return np.array(mask, dtype=np.bool) + + +def categorical_crossentropy(preds, labels): + """ Categorical crossentropy """ + return np.mean(-np.log(np.extract(labels, preds))) + + +def binary_crossentropy(preds, labels): + """ Binary crossentropy """ + return np.mean(-labels*np.log(preds) - (1-labels)*np.log(1-preds)) + + +def two_class_accuracy(preds, labels, threshold=0.5): + """ Binary accuracy """ + return np.mean(np.equal(labels, preds > threshold)) + + +def accuracy(preds, labels): + """ Accuracy """ + return np.mean(np.equal(np.argmax(labels, 1), np.argmax(preds, 1))) + + +def evaluate_preds(preds, labels, indices): + """ Evaluate predictions with categorical crossentropy and accuracy """ + + split_loss = [] + split_acc = [] + + for y_split, idx_split in zip(labels, indices): + split_loss.append(categorical_crossentropy(preds[idx_split], y_split[idx_split])) + split_acc.append(accuracy(preds[idx_split], y_split[idx_split])) + + return split_loss, split_acc + + +def evaluate_preds_sigmoid(preds, labels, indices): + """ Evaluate predictions with binary crossentropy and binary accuracy """ + + split_loss = [] + split_acc = [] + + for y_split, idx_split in zip(labels, indices): + split_loss.append(binary_crossentropy(preds[idx_split], y_split[idx_split])) + split_acc.append(two_class_accuracy(preds[idx_split], y_split[idx_split])) + + return split_loss, split_acc diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/config_npu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/config_npu.sh new file mode 100644 index 0000000000000000000000000000000000000000..4daa82e36faf4c8f9ed012bdff75292eccf99d71 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/config_npu.sh @@ -0,0 +1,35 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Tao Wu + +# CANN v5.0.x +export install_path=$HOME/Ascend/nnae/latest +export LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver:$LD_LIBRARY_PATH +export PATH=${install_path}/fwkacllib/ccec_compiler/bin:${install_path}/fwkacllib/bin:$PATH +export LD_LIBRARY_PATH=${install_path}/fwkacllib/lib64:$LD_LIBRARY_PATH +export PYTHONPATH=${install_path}/fwkacllib/python/site-packages:$PYTHONPATH +export PYTHONPATH=$HOME/Ascend/tfplugin/latest/tfplugin/python/site-packages:$PYTHONPATH +export ASCEND_OPP_PATH=${install_path}/opp +export ASCEND_AICPU_PATH=${install_path} + +export SOC_VERSION=Ascend910 +export JOB_ID=10089 +export ASCEND_DEVICE_ID=0 +export ASCEND_GLOBAL_LOG_LEVEL=1 + +# For debugging only +# export DUMP_GE_GRAPH=1 +# export DUMP_GRAPH_LEVEL=1 +# export DUMP_GRAPH_PATH=./dumps diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_aifb.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_aifb.sh new file mode 100644 index 0000000000000000000000000000000000000000..f9c4e60990766145a4d330c6d19d7d581ea8bc0c --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_aifb.sh @@ -0,0 +1,18 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +cd ./datasets/ +python prepare_dataset.py -d aifb \ No newline at end of file diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_am.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_am.sh new file mode 100644 index 0000000000000000000000000000000000000000..b27c3c4706bca329a4a6fc166b49deba5f26ceb8 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_am.sh @@ -0,0 +1,18 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# +# Author: Udo Schlegel + +cd ./datasets/ +python prepare_dataset.py -d am \ No newline at end of file diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_bgs.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_bgs.sh new file mode 100644 index 0000000000000000000000000000000000000000..d51f2c221a2536aea0e74be45db678267b0f31d3 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_bgs.sh @@ -0,0 +1,18 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +cd ./datasets/ +python prepare_dataset.py -d bgs \ No newline at end of file diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_mutag.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_mutag.sh new file mode 100644 index 0000000000000000000000000000000000000000..0dd472c4f3b35cd06050d127e426164d962ffc2e --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/datasets/prepare_mutag.sh @@ -0,0 +1,18 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +cd ./datasets/ +python prepare_dataset.py -d mutag \ No newline at end of file diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_cpu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_cpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..a1ed7553546c4c454d4f78e1a487553ad95960b4 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_cpu.sh @@ -0,0 +1,26 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# +# Author: Udo Schlegel + +# Train RGCN with AIFB dataset on CPU +python3.7 rgcn/train.py \ + -d aifb \ + -de cpu \ + --dropout 0.0 \ + --bases 0 \ + --hidden 16 \ + --l2norm 0.0 \ + --testing \ + --debug diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_gpu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..290f9212a6ebd7b5dce94eb812beaf2df7eaf339 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_gpu.sh @@ -0,0 +1,24 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +# Train RGCN with AIFB dataset on GPU +python3.7 rgcn/train.py \ + -d aifb \ + -de gpu \ + --bases 0 \ + --hidden 16 \ + --l2norm 0.0 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_npu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_npu.sh new file mode 100644 index 0000000000000000000000000000000000000000..6c1324ce60fe37619053eda30b19344e3ad2b6a9 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_aifb_npu.sh @@ -0,0 +1,26 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +# Train RGCN with AIFB dataset on NPU +source scripts/config_npu.sh +python3.7 rgcn/train.py \ + -d aifb \ + -de npu \ + --dropout 0.0 \ + --bases 0 \ + --hidden 20 \ + --l2norm 0.0 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_cpu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_cpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..65fc2228129db3eda7e39a14a68d04b94272085b --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_cpu.sh @@ -0,0 +1,24 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# +# Author: Udo Schlegel + +# Train RGCN with AM dataset on CPU +python3.7 rgcn/train.py \ + -d am \ + -de cpu \ + --bases 40 \ + --hidden 10 \ + --l2norm 5e-4 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_gpu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..ea198be0976451048e0fa8602eb42e632dc1f2df --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_gpu.sh @@ -0,0 +1,24 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +# Train RGCN with AM dataset on GPU +python3.7 rgcn/train.py \ + -d am \ + -de gpu \ + --bases 40 \ + --hidden 10 \ + --l2norm 5e-4 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_npu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_npu.sh new file mode 100644 index 0000000000000000000000000000000000000000..aaabf3b9ad1f692f4bc3a34aae9bdffb3ed79429 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_am_npu.sh @@ -0,0 +1,25 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +# Train RGCN with AM dataset on NPU +source scripts/config_npu.sh +python3.7 rgcn/train.py \ + -d am \ + -de npu \ + --bases 40 \ + --hidden 20 \ + --l2norm 5e-4 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_cpu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_cpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..1ac423a5be117b41457bdc647d7e97bdb7529cb0 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_cpu.sh @@ -0,0 +1,24 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# +# Author: Udo Schlegel + +# Train RGCN with BGS dataset on CPU +python3.7 rgcn/train.py \ + -d bgs \ + -de cpu \ + --bases 40 \ + --hidden 16 \ + --l2norm 5e-4 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_gpu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..22cadc075337690d7c37721863c0b9887b3e2c67 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_gpu.sh @@ -0,0 +1,24 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +# Train RGCN with BGS dataset on GPU +python3.7 rgcn/train.py \ + -d bgs \ + -de gpu \ + --bases 40 \ + --hidden 16 \ + --l2norm 5e-4 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_npu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_npu.sh new file mode 100644 index 0000000000000000000000000000000000000000..29f1db8421f123ae23931480751ae761ed07221a --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_bgs_npu.sh @@ -0,0 +1,25 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +# Train RGCN with BGS dataset on NPU +source scripts/config_npu.sh +python3.7 rgcn/train.py \ + -d bgs \ + -de npu \ + --bases 40 \ + --hidden 20 \ + --l2norm 5e-4 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_cpu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_cpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..af64bafcc578709fee0b197431c7095cb23dfb8c --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_cpu.sh @@ -0,0 +1,24 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# +# Author: Udo Schlegel + +# Train RGCN with MUTAG dataset on CPU +python3.7 rgcn/train.py \ + -d mutag \ + -de cpu \ + --bases 30 \ + --hidden 16 \ + --l2norm 5e-4 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_gpu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_gpu.sh new file mode 100644 index 0000000000000000000000000000000000000000..80e90dc500d30f63e9ccc28e38d7ffc99754a079 --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_gpu.sh @@ -0,0 +1,24 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +# Author: Udo Schlegel + +# Train RGCN with MUTAG dataset on GPU +python3.7 rgcn/train.py \ + -d mutag \ + -de gpu \ + --bases 30 \ + --hidden 16 \ + --l2norm 5e-4 \ + --testing diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_npu.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_npu.sh new file mode 100644 index 0000000000000000000000000000000000000000..46fc59415e7bdb93509bca2c52192bf305352d7a --- /dev/null +++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/scripts/train_mutag_npu.sh @@ -0,0 +1,25 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+#
+# Author: Udo Schlegel
+
+# Train RGCN with MUTAG dataset on NPU
+source scripts/config_npu.sh
+python3.7 rgcn/train.py \
+    -d mutag \
+    -de npu \
+    --bases 30 \
+    --hidden 16 \
+    --l2norm 5e-4 \
+    --testing
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/env.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/env.sh
new file mode 100644
index 0000000000000000000000000000000000000000..21419b7717cd61ca312bedbc47e16610d29b9593
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/env.sh
@@ -0,0 +1,8 @@
+export install_path=$HOME/Ascend/nnae/latest
+export LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64:/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver:$LD_LIBRARY_PATH
+export PATH=${install_path}/fwkacllib/ccec_compiler/bin:${install_path}/fwkacllib/bin:$PATH
+export LD_LIBRARY_PATH=${install_path}/fwkacllib/lib64:$LD_LIBRARY_PATH
+export PYTHONPATH=${install_path}/fwkacllib/python/site-packages:$PYTHONPATH
+export PYTHONPATH=$HOME/Ascend/tfplugin/latest/tfplugin/python/site-packages:$PYTHONPATH
+export ASCEND_OPP_PATH=${install_path}/opp
+export ASCEND_AICPU_PATH=${install_path}
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/train_full_1p.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/train_full_1p.sh
new file mode 100644
index 0000000000000000000000000000000000000000..af21ac5b002b4415631af99ab0f833fdee5deb1e
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/train_full_1p.sh
@@ -0,0 +1,177 @@
+#!/bin/bash
+
+# Current path, no modification needed
+cur_path=`pwd`
+
+# Collective communication parameters, no modification needed
+export RANK_SIZE=1
+export JOB_ID=10087
+RANK_ID_START=0
+
+
+# Dataset path, keep the default, no modification needed
+data_path="./data"
+
+# Basic parameters, review and modify per model
+# Network name, same as the directory name
+Network="RGCN_for_Tensorflow"
+
+# config_file=res50_32bs_1p_host_full
+# max_train_steps=2000
+# iterations_per_loop=1000
+# debug=True
+# eval=True
+# Reference config
+batch_size='full'
+export DEVICE_ID=${RANK_ID_START}
+export ASCEND_DEVICE_ID=${DEVICE_ID}
+DEVICE_INDEX=$(( DEVICE_ID + RANK_INDEX * 1))
+export DEVICE_INDEX=${DEVICE_INDEX}
+
+# TF2.X only, review and modify per model
+# export NPU_LOOP_SIZE=${train_steps}
+
+# Debug/tuning parameters; precision_mode should be reviewed per model
+precision_mode="allow_mix_precision"
+# Maintenance parameters, no modification needed below
+over_dump=False
+data_dump_flag=False
+data_dump_step="10"
+profiling=False
+
+# Help message, no modification needed
+if [[ $1 == --help || $1 == -h ]];then
+    echo "usage: ./train_full_1p.sh"
+    echo " "
+    echo "parameter explain:
+    --over_dump        if or not over detection, default is False
+    --data_dump_flag   data dump flag, default is False
+    --data_dump_step   data dump step, default is 10
+    --profiling        if or not profiling for performance debug, default is False
+    --data_path        source data of training
+    -h/--help          show help message
+    "
+    exit 1
+fi
+
+# Parameter parsing, no modification needed
+for para in $*
+do
+    if [[ $para == --precision_mode* ]];then
+        precision_mode=`echo ${para#*=}`
+    elif [[ $para == --over_dump* ]];then
+        over_dump=`echo ${para#*=}`
+        over_dump_path=${cur_path}/output/overflow_dump
+        mkdir -p ${over_dump_path}
+    elif [[ $para == --data_dump_flag* ]];then
+        data_dump_flag=`echo ${para#*=}`
+        data_dump_path=${cur_path}/output/data_dump
+        mkdir -p ${data_dump_path}
+    elif [[ $para == --data_dump_step* ]];then
+        data_dump_step=`echo ${para#*=}`
+    elif [[ $para == --profiling* ]];then
+        profiling=`echo ${para#*=}`
+        profiling_dump_path=${cur_path}/output/profiling
+        mkdir -p ${profiling_dump_path}
+    elif [[ $para == --data_path* ]];then
+        data_path=`echo ${para#*=}`
+    fi
+done
+
+# Check that data_path was provided, no modification needed
+if [[ $data_path == "" ]];then
+    echo "[Error] \"data_path\" is missing"
+    exit 1
+fi
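+
+# Example invocation (sketch):
+#   bash test/train_full_1p.sh --data_path=./data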
+
+# Training start time, no modification needed
+start_time=$(date +%s)
+
+# Enter the training script directory, review and modify per model
+cd $cur_path/..
+for((RANK_ID=$RANK_ID_START;RANK_ID<$((RANK_SIZE+RANK_ID_START));RANK_ID++));
+do
+    # Set environment variables, no modification needed
+    echo "Device ID: $ASCEND_DEVICE_ID"
+    export RANK_ID=$RANK_ID
+
+
+
+    # Create the DeviceID output directory, no modification needed
+    if [ -d ${cur_path}/output/${ASCEND_DEVICE_ID} ];then
+        rm -rf ${cur_path}/output/${ASCEND_DEVICE_ID}
+        mkdir -p ${cur_path}/output/$ASCEND_DEVICE_ID/ckpt
+    else
+        mkdir -p ${cur_path}/output/$ASCEND_DEVICE_ID/ckpt
+    fi
+    # Core binding; remove for models that do not need it, review per model
+    let a=RANK_ID*12
+    let b=RANK_ID+1
+    let c=b*12-1
+
+    # Run the training script; the arguments below need no changes, others should be reviewed per model
+    #--data_dir, --model_dir, --precision_mode, --over_dump, --over_dump_path,--data_dump_flag,--data_dump_step,--data_dump_path,--profiling,--profiling_dump_path
+    cd ${cur_path}
+    nohup python3.7 rgcn/train.py \
+        --device npu \
+        --dataset bgs \
+        --device-id ${ASCEND_DEVICE_ID} \
+        --bases 40 \
+        --hidden 20 \
+        --l2norm 5e-4 \
+        --testing \
+        --out-path=${cur_path}/output \
+        > ${cur_path}/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log 2>&1 &
+done
+wait

+# Create an archive containing model file and input data files
+# DATASET=scripts/datasets/prepare_bgs.sh
+
+
+
+# Training end time, no modification needed
+end_time=$(date +%s)
+e2e_time=$(( $end_time - $start_time ))
+
+echo "------------------ Final result ------------------"
+# Output performance FPS, review and modify per model
+FPS=`grep "epochs_per_second" $cur_path/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log | tail -1 | cut -d"=" -f 6`
+# Print, no modification needed
+echo "Final Performance epochs/sec : $FPS"
+# Training accuracy, extracted by keyword from train_$ASCEND_DEVICE_ID.log; review and modify per model
+train_accuracy=`grep "acc_train" $cur_path/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log | tail -1 | cut -d"=" -f 4 | cut -d"," -f 1`
+echo "Final Train Accuracy: ${train_accuracy}"
+# E2E training duration, computed directly, no modification needed
+echo "E2E Training Duration sec: $e2e_time"
+
+# Training case information, no modification needed
+BatchSize=${batch_size}
+DeviceType=`uname -m`
+CaseName=${Network}_bs${BatchSize}_${RANK_SIZE}'p'_'acc'
+
+# Collect performance data, no modification needed
+# Throughput
+ActualFPS=${FPS}
+# Time per training iteration
+TrainingTime=`grep "Training ended" $cur_path/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log | tail -1 | cut -d" " -f 8`
+ActualTrainSteps=`grep "Training ended" $cur_path/output/$ASCEND_DEVICE_ID/train_$ASCEND_DEVICE_ID.log | tail -1 | cut -d" " -f 4`
+TrainingTime=`awk "BEGIN {print $TrainingTime/$ActualTrainSteps}"`
+# Extract loss values by keyword from train_*.log; review per model
+grep "loss_train" $cur_path/output/$ASCEND_DEVICE_ID/train_$ASCEND_DEVICE_ID.log | cut -d"=" -f 2 | cut -d"," -f 1 >> $cur_path/output/$ASCEND_DEVICE_ID/train_${CaseName}_loss.txt
+
+# Loss of the last iteration, no modification needed
+ActualLoss=`grep "loss_train" $cur_path/output/$ASCEND_DEVICE_ID/train_$ASCEND_DEVICE_ID.log | tail -1 | cut -d"=" -f 2 | cut -d"," -f 1`
+
+# Print key information to ${CaseName}.log, no modification needed
+echo "Network = ${Network}" > $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "RankSize = ${RANK_SIZE}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "BatchSize = ${BatchSize}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "DeviceType = ${DeviceType}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "CaseName = ${CaseName}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "ActualFPS = ${ActualFPS}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "TrainingTime = ${TrainingTime}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "ActualLoss = ${ActualLoss}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "E2ETrainingTime = ${e2e_time}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "TrainAccuracy = ${train_accuracy}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+
diff --git a/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/train_performance_1p.sh b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/train_performance_1p.sh
new file mode 100644
index 0000000000000000000000000000000000000000..1d7fab0f1e82bfc67e893c3936caaeb9863178ab
--- /dev/null
+++ b/TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test/train_performance_1p.sh
@@ -0,0 +1,177 @@
+#!/bin/bash
+
+# Current path, no modification needed
+cur_path=`pwd`
+
+# Collective communication parameters, no modification needed
+export RANK_SIZE=1
+export JOB_ID=10087
+RANK_ID_START=0
+
+
+# Dataset path, keep the default, no modification needed
+data_path="./data"
+
+# Basic parameters, review and modify per model
+# Network name, same as the directory name
+Network="RGCN_for_Tensorflow"
+
+# config_file=res50_32bs_1p_host_full
+max_train_steps=10
+# iterations_per_loop=1000
+# debug=True
+# eval=True
+# Reference config
+batch_size='full'
+export DEVICE_ID=${RANK_ID_START}
+export ASCEND_DEVICE_ID=${DEVICE_ID}
+DEVICE_INDEX=$(( DEVICE_ID + RANK_INDEX * 1))
+export DEVICE_INDEX=${DEVICE_INDEX}
+
+# TF2.X only, review and modify per model
+# export NPU_LOOP_SIZE=${train_steps}
+
+# Debug/tuning parameters; precision_mode should be reviewed per model
+precision_mode="allow_mix_precision"
+# Maintenance parameters, no modification needed below
+over_dump=False
+data_dump_flag=False
+data_dump_step="10"
+profiling=False
+
+# Help message, no modification needed
+if [[ $1 == --help || $1 == -h ]];then
+    echo "usage: ./train_performance_1p.sh"
+    echo " "
+    echo "parameter explain:
+    --over_dump        if or not over detection, default is False
+    --data_dump_flag   data dump flag, default is False
+    --data_dump_step   data dump step, default is 10
+    --profiling        if or not profiling for performance debug, default is False
+    --data_path        source data of training
+    -h/--help          show help message
+    "
+    exit 1
+fi
+
+# Parameter parsing, no modification needed
+for para in $*
+do
+    if [[ $para == --precision_mode* ]];then
+        precision_mode=`echo ${para#*=}`
+    elif [[ $para == --over_dump* ]];then
+        over_dump=`echo ${para#*=}`
+        over_dump_path=${cur_path}/output/overflow_dump
+        mkdir -p ${over_dump_path}
+    elif [[ $para == --data_dump_flag* ]];then
+        data_dump_flag=`echo ${para#*=}`
+        data_dump_path=${cur_path}/output/data_dump
+        mkdir -p ${data_dump_path}
+    elif [[ $para == --data_dump_step* ]];then
+        data_dump_step=`echo ${para#*=}`
+    elif [[ $para == --profiling* ]];then
+        profiling=`echo ${para#*=}`
+        profiling_dump_path=${cur_path}/output/profiling
+        mkdir -p ${profiling_dump_path}
+    elif [[ $para == --data_path* ]];then
+        data_path=`echo ${para#*=}`
+    fi
+done
+
+# Check that data_path was provided, no modification needed
+if [[ $data_path == "" ]];then
+    echo "[Error] \"data_path\" is missing"
+    exit 1
+fi
+
+# Training start time, no modification needed
+start_time=$(date +%s)
+
+# Enter the training script directory, review and modify per model
+cd $cur_path/..
+for((RANK_ID=$RANK_ID_START;RANK_ID<$((RANK_SIZE+RANK_ID_START));RANK_ID++));
+do
+    # Set environment variables; no modification needed
+    echo "Device ID: $ASCEND_DEVICE_ID"
+    export RANK_ID=$RANK_ID
+
+    # Create a fresh DeviceID output directory; no modification needed
+    rm -rf ${cur_path}/output/${ASCEND_DEVICE_ID}
+    mkdir -p ${cur_path}/output/$ASCEND_DEVICE_ID/ckpt
+
+    # Core binding: delete for models that do not need it; review and modify
+    # for models that do (a..c are this rank's CPU core range, see the note above)
+    let a=RANK_ID*12
+    let b=RANK_ID+1
+    let c=b*12-1
+
+    # Run the training script. The arguments below need no modification;
+    # any others should be reviewed per model:
+    # --data_dir, --model_dir, --precision_mode, --over_dump, --over_dump_path,
+    # --data_dump_flag, --data_dump_step, --data_dump_path, --profiling,
+    # --profiling_dump_path
+    # Run from the model root (entered above) so that rgcn/train.py resolves
+    nohup python3.7 rgcn/train.py \
+        --device npu \
+        --dataset bgs \
+        --device-id ${ASCEND_DEVICE_ID} \
+        --bases 40 \
+        --hidden 20 \
+        --l2norm 5e-4 \
+        --testing \
+        --epochs ${max_train_steps} \
+        --out-path=${cur_path}/output \
+        > ${cur_path}/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log 2>&1 &
+done
+wait
+
+# Optional dataset preparation:
+# DATASET=scripts/datasets/prepare_bgs.sh
+
+# Training end time; no modification needed
+end_time=$(date +%s)
+e2e_time=$(( $end_time - $start_time ))
+
+echo "------------------ Final result ------------------"
+# Report performance (here epochs/sec); review and modify per model
+FPS=`grep "epochs_per_second" $cur_path/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log | tail -1 | cut -d"=" -f 6`
+# Print; no modification needed
+echo "Final Performance epochs/sec : $FPS"
+# Training accuracy, extracted by keyword from train_$ASCEND_DEVICE_ID.log; review and modify per model
+train_accuracy=`grep "acc_train" $cur_path/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log | tail -1 | cut -d"=" -f 4 | cut -d"," -f 1`
+echo "Final Train Accuracy: ${train_accuracy}"
+# End-to-end training duration, computed directly; no modification needed
+echo "E2E Training Duration sec: $e2e_time"
+
+# Training-case information; no modification needed
+BatchSize=${batch_size}
+DeviceType=`uname -m`
+CaseName=${Network}_bs${BatchSize}_${RANK_SIZE}p_perf
+
+# Collect performance data; no modification needed
+# Throughput
+ActualFPS=${FPS}
+# Time per training step: total training time divided by the step count,
+# both parsed from the "Training ended" log line
+TrainingTime=`grep "Training ended" $cur_path/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log | tail -1 | cut -d" " -f 8`
+ActualTrainSteps=`grep "Training ended" $cur_path/output/$ASCEND_DEVICE_ID/train_$ASCEND_DEVICE_ID.log | tail -1 | cut -d" " -f 4`
+TrainingTime=`awk "BEGIN {print $TrainingTime/$ActualTrainSteps}"`
+# Extract the loss curve by keyword from train_*.log; review per model
+grep "loss_train" $cur_path/output/$ASCEND_DEVICE_ID/train_$ASCEND_DEVICE_ID.log | cut -d"=" -f 2 | cut -d"," -f 1 >> $cur_path/output/$ASCEND_DEVICE_ID/train_${CaseName}_loss.txt
+
+# Loss of the last step; no modification needed
+ActualLoss=`grep "loss_train" $cur_path/output/$ASCEND_DEVICE_ID/train_$ASCEND_DEVICE_ID.log | tail -1 | cut -d"=" -f 2 | cut -d"," -f 1`
+
+# Write the key information to ${CaseName}.log; no modification needed
+echo "Network = ${Network}" > $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "RankSize = ${RANK_SIZE}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "BatchSize = ${BatchSize}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "DeviceType = ${DeviceType}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "CaseName = ${CaseName}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "ActualFPS = ${ActualFPS}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "TrainingTime = ${TrainingTime}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "ActualLoss = ${ActualLoss}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "E2ETrainingTime = ${e2e_time}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
+echo "TrainAccuracy = ${train_accuracy}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log
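For reference, a minimal invocation sketch for the two scripts above (assuming the repository layout in the diff paths and that the BGS dataset is available under ./data):

    cd TensorFlow/built-in/gnn/RGCN_for_Tensorflow/test
    bash train_full_1p.sh --data_path=./data          # full accuracy run
    bash train_performance_1p.sh --data_path=./data   # short 10-epoch performance run

Results land in test/output/<device id>/, including the raw training log train_<device id>.log and the summarized <CaseName>.log.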