Wikidata:Property proposal/Natural science

This page is for the proposal of new properties.

Before proposing a property

  1. Search if the property already exists.
  2. Search if the property has already been proposed.
  3. Check if you can give a similar label and definition as an existing Wikipedia infobox parameter, or if it can be matched to an infobox, to or from which data can be transferred automatically.
  4. Select the right datatype for the property.
  5. Read Wikidata:Creating a property proposal for guidelines you should follow when proposing a new property.
  6. Start writing the documentation based on the preload form below by editing the two templates at the top of the page to add proposal details.

Creating the property

  1. Once consensus is reached, set status=ready on the template, to attract the attention of a property creator.
  2. Creation can be done 1 week after the creation of the proposal, by a property creator or an administrator.
  3. See property creation policy.



Physics/astronomy

SIMBAD catalog properties (used more than 1 million times)

Gaia Data Release 2 ID

Status: Under discussion
Description: identifier for an astronomical object in Gaia Data Release 2
Data type: External identifier
Domain: astronomical objects
Allowed values: [0-9]{18}
Example 1: BS Cnc (Q2889194) → 661284024235415808
Example 2: Gliese 450 (Q5880899) → 4031586157514097024
Example 3: TYC 3645-2080-1 (Q75838267) → 1943381923013901440
Source: Gaia Data Release 2 (Q51905050)
Planned use: migrate all P528 values qualified with P972 Q51905050 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=Gaia%20DR2%20$1
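
As a rough illustration of how the formatter URL above would be used, the sketch below substitutes an identifier for the $1 placeholder. The helper name expand_formatter is made up for this example, and percent-encoding the value with urllib.parse.quote is an assumption about how a client might handle characters such as "+"; it is not a description of how Wikidata itself builds the link.

```python
from urllib.parse import quote

# Formatter URL copied from the proposal; "$1" marks where the ID goes.
FORMATTER_URL = "https://simbad.u-strasbg.fr/simbad/sim-id?Ident=Gaia%20DR2%20$1"


def expand_formatter(formatter: str, value: str) -> str:
    """Hypothetical helper: percent-encode the value and put it in place of $1."""
    return formatter.replace("$1", quote(value, safe=""))


url = expand_formatter(FORMATTER_URL, "661284024235415808")
print(url)
```

For a purely numeric Gaia identifier the encoding step changes nothing; it only matters for identifiers that contain reserved characters.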

2MASS ID

Status: Under discussion
Description: identifier for an astronomical object in the Two Micron All Sky Survey
Data type: External identifier
Domain: astronomical objects
Allowed values: J[0-9]{8}[+-][0-9]{7}
Example 1: BS Cnc (Q2889194) → J08390909+1935327
Example 2: Gliese 450 (Q5880899) → J11510737+3516188
Example 3: TYC 3645-2080-1 (Q75838267) → J23350993+4851114
Source: 2MASS (Q1454942)
Planned use: migrate all P528 values qualified with P972 Q1454942 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=2MASS%20$1

Tycho-2 Catalogue ID

Status: Under discussion
Description: identifier for an astronomical object in the Tycho-2 Catalogue
Data type: External identifier
Domain: astronomical objects
Allowed values: [0-9]{1,4}-[0-9]{1,4}-1
Example 1: BS Cnc (Q2889194) → 1395-2445-1
Example 2: Gliese 450 (Q5880899) → 2526-2357-1
Example 3: TYC 3645-2080-1 (Q75838267) → 3645-2080-1
Source: The Tycho-2 catalogue of the 2.5 million brightest stars (Q2725928)
Planned use: migrate all P528 values qualified with P972 Q2725928 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=TYC%20$1

Gaia Data Release 1 ID

Status: Under discussion
Description: identifier for an astronomical object in Gaia Data Release 1
Data type: External identifier
Domain: astronomical objects
Allowed values: [0-9]{18}
Example 1: BS Cnc (Q2889194) → 661284019938140032
Example 2: Gliese 450 (Q5880899) → 4031586157514097024
Example 3: TYC 3645-2080-1 (Q75838267) → 1943381923012780160
Source: Gaia Data Release 1 (Q37859523)
Planned use: migrate all P528 values qualified with P972 Q37859523 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=Gaia%20DR1%20$1

SDSS object ID

Status: Under discussion
Description: identifier for an astronomical object in the Sloan Digital Sky Survey
Data type: External identifier
Domain: astronomical objects
Allowed values: J[0-9]{6}\.[0-9]{2}[+-][0-9]{7}\.[0-9]
Example 1: BS Cnc (Q2889194) → J083909.03+193532.4
Example 2: Gliese 450 (Q5880899) → J115106.57+351627.2
Example 3: TYC 3645-2080-1 (Q75838267) → J233509.93+485111.4
Source: Sloan Digital Sky Survey (Q840332)
Planned use: migrate all P528 values qualified with P972 Q840332 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=SDSS%20$1

OGLE-III object ID

Status: Under discussion
Description: identifier for an astronomical object in the Optical Gravitational Lensing Experiment
Data type: External identifier
Domain: astronomical objects
Example 1: R99 (Q22087000) → BRIGHT-LMC-MISC-429
Example 2: R85 (Q28406638) → BRIGHT-LMC-MISC-9
Example 3: SV* HV 2827 (Q74703824) → LMC-CEP-4689
Source: The Optical Gravitational Lensing Experiment. The OGLE-III catalog of variable stars. I. Classical Cepheids in the Large Magellanic Cloud (Q67054966)
Planned use: migrate all P528 values qualified with P972 Q67054966 to this property
Formatter URL: https://simbad.u-strasbg.fr/simbad/sim-id?Ident=OGLE%20$1
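
Before creation it may be worth checking each "Allowed values" pattern against the examples listed with it. The sketch below does that for the five proposals above that state a pattern, using the patterns and example values exactly as written. It assumes the pattern must match the whole identifier (hence re.fullmatch); the function and variable names are invented for this example.

```python
import re

# (pattern, example values) copied from the proposals above.
PROPOSALS = {
    "Gaia DR2": (r"[0-9]{18}",
                 ["661284024235415808", "4031586157514097024", "1943381923013901440"]),
    "2MASS": (r"J[0-9]{8}[+-][0-9]{7}",
              ["J08390909+1935327", "J11510737+3516188", "J23350993+4851114"]),
    "Tycho-2": (r"[0-9]{1,4}-[0-9]{1,4}-1",
                ["1395-2445-1", "2526-2357-1", "3645-2080-1"]),
    "Gaia DR1": (r"[0-9]{18}",
                 ["661284019938140032", "4031586157514097024", "1943381923012780160"]),
    "SDSS": (r"J[0-9]{6}\.[0-9]{2}[+-][0-9]{7}\.[0-9]",
             ["J083909.03+193532.4", "J115106.57+351627.2", "J233509.93+485111.4"]),
}


def failing_examples(pattern, values):
    """Return the values that the pattern does not match in full."""
    compiled = re.compile(pattern)
    return [value for value in values if compiled.fullmatch(value) is None]


report = {name: failing_examples(pattern, values)
          for name, (pattern, values) in PROPOSALS.items()}

for name, failures in report.items():
    print(name, "OK" if not failures else f"no match: {failures}")
```

Under that full-match assumption, the 2MASS and Tycho-2 examples pass, while the 19-digit Gaia examples and all three SDSS examples (which have six digits, not seven, between the sign and the final decimal point) fall outside the stated patterns, so those patterns may need loosening.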

Motivation

The specific combination of catalog code (P528) qualified by catalog (P972) is used in 24 million statements, the vast majority of which are for astronomical objects. About 14 million of these statements come from six catalogues, so migrating those statements to use these properties would remove the 14 million triples taken up by the P972 qualifiers. (Another 18 catalogues have more statements than the number of statements for inventory number (P217) with qualifier collection (P195) The Palace Museum (Q2047427)—127545 as of 6 August 2024.)

(This migration would be similar to the migration that took place after the properties proposed at Wikidata:Property proposal/proper motion components were created. While this page intends to handle only the six largest catalogues, if you believe there are other large catalogues whose catalog codes would do well to be migrated to properties, please say so in a comment.) Mahir256 (talk) 21:56, 6 August 2024 (UTC)[reply]
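
To make the intended migration concrete, here is a minimal sketch of the transformation on a toy data model. Statements are plain dictionaries, and "P_GAIA_DR2" is a placeholder, since no property ID exists until a property is created; none of this reflects an actual bot or the Wikibase API.

```python
# Hypothetical mapping from catalog item (the P972 qualifier value) to the
# dedicated property that would replace it. The target ID is a placeholder.
CATALOG_TO_PROPERTY = {"Q51905050": "P_GAIA_DR2"}


def migrate(statement):
    """Rewrite a P528 statement qualified with P972 as a dedicated-property
    statement. Anything else is returned unchanged."""
    qualifiers = statement.get("qualifiers", {})
    target = CATALOG_TO_PROPERTY.get(qualifiers.get("P972"))
    if statement["property"] != "P528" or target is None:
        return statement
    # Drop the P972 qualifier: the catalog is now implied by the property.
    remaining = {p: v for p, v in qualifiers.items() if p != "P972"}
    return {**statement, "property": target, "qualifiers": remaining}


before = {
    "item": "Q2889194",            # BS Cnc
    "property": "P528",            # catalog code
    "value": "661284024235415808",
    "qualifiers": {"P972": "Q51905050"},
}
after = migrate(before)
print(after)
```

Each migrated statement loses one qualifier, which is where the saving described in the motivation comes from. Whether stored P528 values carry a catalog prefix that would need stripping is not addressed here.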

Discussion

@Mahir256 Is there any specific reason why we want to reduce the number of P528 statements? Ghuron (talk) 00:03, 7 August 2024 (UTC)[reply]
@Ghuron: We have dedicated external identifier properties rather than lumping them all in a single property and qualifying them, just as we have dedicated website account properties rather than always using website account on (P553) qualified with website username or ID (P554). This proposal is intended as a logical parallel of both of those decisions. Mahir256 (talk) 17:18, 12 August 2024 (UTC)[reply]
@Mahir256: Let me rephrase how I understood your rationale: if p:P528/pq:P972 wd:Q51905050 occurs more than a million times, then that is both a necessary and a sufficient condition for creating a new property, since it reduces the number of triples and thus reduces the risk of Blazegraph crashing. Is that a correct summary? Ghuron (talk) 22:44, 12 August 2024 (UTC)[reply]
@Ghuron: I would not phrase it quite so absolutely, but I do want to see the number of triples reduced and believe this is a way to do it; an extremely high number of identically structured uses of a generic identification property like catalog code (P528) with the same qualifiers suggests that a more specialized identifier property is worth introducing to streamline things, just as has been done multiple times before. Mahir256 (talk) 16:50, 13 August 2024 (UTC)[reply]
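
For anyone wanting to check usage figures for a given catalog, the pattern discussed above (p:P528 with qualifier pq:P972) can be turned into a count query. The sketch below only builds the query text and a request URL for the public Wikidata Query Service; it does not send the request. The function name is invented, and a count over millions of statements may well time out in practice.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def count_query(catalog_qid: str) -> str:
    """SPARQL text counting P528 statements qualified with P972 = catalog."""
    return (
        "SELECT (COUNT(*) AS ?n) WHERE { "
        f"?item p:P528 ?st . ?st pq:P972 wd:{catalog_qid} . "
        "}"
    )


query = count_query("Q51905050")  # Gaia Data Release 2
request_url = WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
print(query)
```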
As Ghuron asked, is there any reason why we need to reduce the number of P528 statements? In the first place, there are millions of Gaia IDs because of the import of the SIMBAD database (I am NOT against this import, by the way).
Also, I wonder why only some catalogues would have their own properties. This will create a weird in-between state, with some catalogues in P528 and others having their own properties. This makes no sense, in my opinion.
Romuald 2 (talk) 15:31, 8 August 2024 (UTC)[reply]
  • There is nothing wrong with having separate external ID properties for the most-used identifiers, each with the correct formatter URL.
    But I have two major objections:
  1. I don't see any reason to use https://simbad.u-strasbg.fr/simbad/sim-id?Ident= as the URL. For items that are on SIMBAD, we already have Property:P3083 with the link to SIMBAD. For the rare items that are not on SIMBAD, this link will result in a 404.
  2. Given (1), it would make sense to link to genuinely useful external databases that are only partially synchronized with SIMBAD (such as HyperLEDA or the Gaia Archive). That leads to questions about the proposed set of properties:
    1. Why did we choose Gaia DR2, whose IDs are only temporary, when the permanent ones are in Gaia DR3?
    2. Why did we choose Tycho-2, when its entries are pretty much 100% imported into SIMBAD?
Ghuron (talk) 12:52, 9 August 2024 (UTC)[reply]
  • @Romuald 2: Reducing the number of RDF triples that Wikidata consists of is generally a good thing: there is a lot of ongoing discussion about the health of the Query Service, and about how it would benefit from a single running Blazegraph instance holding fewer triples. I had also noted that there were 18 other catalogs with more entries than the most frequent inventory number source; I left them off this page only because it would have become too long. If these six go through, then I will promptly propose properties for those 18 (and as I stated in the motivation above, if you believe there are other large catalogues whose catalog codes would do well to be migrated to properties, please say so in a comment). Mahir256 (talk) 17:18, 12 August 2024 (UTC)[reply]
    @Ghuron: The reason I selected the SIMBAD formatter URL is that the external IDs I tried with that URL all seemed to resolve to the right objects; if there are in fact objects for which this resolution doesn't work, it would be great if you could name some. The caveat "(used more than 1 million times)" in the title of this property proposal page is important; because your imports did not yield more than 1 million Gaia DR3 identifiers, I did not think to propose a property for it here, though I'd gladly support one for Gaia DR3 if you think it would be useful. I don't know who "we" is as regards either Gaia DR2 or Tycho-2; you're the one who mass-imported the objects, so I'm working with the catalog codes I see on those objects. Mahir256 (talk) 17:18, 12 August 2024 (UTC)[reply]

Biology

Please visit Wikidata:WikiProject Taxonomy for more information. To notify participants use {{Ping project|Taxonomy}}
Please visit Wikidata:WikiProject Biology for more information. To notify participants use {{Ping project|Biology}}

Larval host plant

Status: Under discussion
Description: Larval host plant - used only for insects - subclass of P1034
Data type: Item
Domain: insect (Q1390)
Allowed values: plant (Q756)
Example 1: Pterophorus (Q3007776) → Larval host plant → Convolvulaceae (Q145777)
Example 2: Adela septentrionella (Q4681639) → Larval host plant → Holodiscus discolor (Q3139473)
Example 3: Platyptilia direptalis (Q14121153) → Larval host plant → Teucrium quadrifarium (Q15245892)
Source: relevant scientific literature/URL
See also: main food source (P1034), host (P2975), has natural reservoir (P1605), has biological vector (P11231)

Motivation


I am planning to ingest a large database of lepidopteran hosts, so far maintained by the NHM and released under CC0 (https://data.nhm.ac.uk/dataset/hosts). Using P1034 for this would pollute that property and would not be precise: adult Lepidoptera are largely nectar feeders and are not very species-specific about the flowers they visit, whereas larvae can be extremely specific and show coevolutionary trends between hosts and feeders (which might be visualizable from this data). This is my first ever property proposal, so please feel free to make corrections where needed or let me know what else I ought to read up on. I might have missed information on cardinality and constraints. This needs to be a subclass of P1034; I am not sure whether that needs to be added in the template. Shyamal (talk) 01:58, 20 August 2024 (UTC)[reply]

Discussion

A qualifier for "in geography" could be added that could be used when species use different hosts in different regions. Shyamal (talk) 02:19, 20 August 2024 (UTC)[reply]


  Comment Thank you for the proposal! As you mention, we do have main food source (P1034), but also host (P2975) and has biological vector (P11231). I think that ingesting the database would be of great value, but maybe we can use host (P2975) with a qualifier, such as applies to part (P518). A second modelling option — and one that I like in particular — is to use object of statement has role (P3831) as qualifier for the statement.


E.g. Adela septentrionella (Q4681639) host (P2975) Holodiscus discolor (Q3139473) object of statement has role (P3831) larval host plant (Q129425539).


In that way, we would have a flexible structure for different kinds of host relations and we would not need any additional properties. Would that make sense for your needs? TiagoLubiana (talk) 14:59, 20 August 2024 (UTC)[reply]
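
The modelling suggested above (host (P2975) qualified with object of statement has role (P3831)) maps fairly directly onto a QuickStatements row. The sketch below produces one tab-separated row in the V1 text format, where the main item, property and value are followed by qualifier property/value pairs. The function name is invented for this example; the QIDs are those from the example statement.

```python
LARVAL_HOST_PLANT = "Q129425539"  # role item named in the example statement


def host_row(insect_qid: str, plant_qid: str, role_qid: str = LARVAL_HOST_PLANT) -> str:
    """One QuickStatements V1 row: item, P2975 (host), value,
    then the P3831 qualifier and its value, all tab-separated."""
    return "\t".join([insect_qid, "P2975", plant_qid, "P3831", role_qid])


# Adela septentrionella -> host -> Holodiscus discolor, role: larval host plant
row = host_row("Q4681639", "Q3139473")
print(row)
```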

Have I understood your suggestion right at this entry - Adela septentrionella (Q4681639)? Shyamal (talk) 01:29, 21 August 2024 (UTC)[reply]
So it would be useful if it could all be queried finally to produce a trophic interaction network, so perhaps main food source (P1034) should also ideally be included under a similar structure. Shyamal (talk) 01:32, 21 August 2024 (UTC)[reply]
@TiagoLubiana: host (P2975) is indeed already being used for host plants in some lepidopteran examples, so that does look like the way to go. I have set up a QuickStatements file of about 140485 rows; if you think the above is good to go, I will start the run. Shyamal (talk) 10:23, 21 August 2024 (UTC)[reply]
That does look good! A few ideas:
- Perhaps create an item for the HOSTS dataset and use it as the value for the stated in (P248) reference too.
- It would be great if the original source of the information could be pulled too, but it seems it is not available on HOSTS.
- 140,000 rows is quite a lot for QuickStatements! Maybe a bot would be in order? You could file a bot request here: https://www.wikidata.org/wiki/Wikidata:Requests_for_permissions/Bot. You can also run it on QuickStatements, as it is a one-off kind of thing (though the rules are hazy), but bots are usually faster and a bit more flexible than QS.
- If going the QuickStatements route, perhaps run a smaller batch first (100? 1000?) just to make sure everything is aligned with your expectations.
Also, maybe we should move this conversation from here to some talk page. Perhaps to https://www.wikidata.org/wiki/Wikidata:WikiProject_Biodiversity? TiagoLubiana (talk) 17:26, 26 August 2024 (UTC)[reply]
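
Two of the suggestions above, adding a stated in (P248) reference and running a small pilot batch first, can be sketched as follows. "Q_HOSTS_DATASET" is a placeholder because no item for the HOSTS dataset is named in this discussion, the rows are synthetic, and the function names are made up. The S248 column follows the QuickStatements V1 convention of prefixing reference properties with S.

```python
HOSTS_DATASET_ITEM = "Q_HOSTS_DATASET"  # placeholder; no real QID is given here


def with_reference(row: str, source_qid: str = HOSTS_DATASET_ITEM) -> str:
    """Append a 'stated in' reference (S248) to a QuickStatements V1 row."""
    return row + "\tS248\t" + source_qid


def split_pilot(rows, pilot_size=100):
    """Split rows into a small pilot batch and the remainder."""
    return rows[:pilot_size], rows[pilot_size:]


# Synthetic rows standing in for the real file.
rows = [f"Q{n}\tP2975\tQ{n + 1}\tP3831\tQ129425539" for n in range(1000, 1250)]
rows = [with_reference(r) for r in rows]

pilot, remainder = split_pilot(rows, pilot_size=100)
print(len(pilot), len(remainder))
print(pilot[0])
```

The pilot batch can be submitted and inspected before the remainder is run.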
Sorry, seeing this late. The import run has been completed. I don't know what the standard practice is, because there is no discipline-based separation of discussion areas (beyond "natural science"). Shyamal (talk) 15:06, 10 September 2024 (UTC)[reply]

mode of reproduction

Status: Under discussion
Description: ways for living organisms to propagate or produce their offspring
Data type: Item
Domain: item
Allowed values: item
Example 1: mammal (Q7377) → sexual reproduction (Q182353)
Example 2: bacteria (Q10876) → cell division (Q188909)
Example 3: plant (Q756) → asexual reproduction (Q173432)
Example 4: plant (Q756) → sexual reproduction (Q182353)
Planned use: Would like to enable specifying mode(s) of reproduction for any organism or taxon via this property, preferably with references.
Expected completeness: always incomplete (Q21873886)

Motivation

Currently, for the hundreds of thousands of Wikidata records related to taxa or organisms, there is no easy way to specify the mode of reproduction. This proposed property is intended to fill a gap. --Zhenqinli (talk) 04:37, 30 August 2024 (UTC)[reply]

Discussion

  Notified participants of WikiProject Biology. –Samoasambia 09:33, 30 August 2024 (UTC)[reply]

Agreed that there is no need to specify this property for every species. For some, specification at the highest level of taxa would suffice. However, there is a great deal of diversity and variability in the biological world. Even just for vertebrates, the mode of reproduction could be: oviparity (Q212306), viviparity (Q120446), and ovoviviparity (Q192805). In short, this property would provide an option for clarifications when more explicit explanation(s) are needed. --Zhenqinli (talk) 13:28, 30 August 2024 (UTC)[reply]
Thanks for the feedback. Indeed, having has characteristic (P1552) with any subclass of mode of biological reproduction (Q130077803) is better than having no information regarding an organism's mode(s) of reproduction in Wikidata. Currently there are almost 300 taxon-related properties. Many of them could have been implemented in similar ways as suggested. In my personal opinion though, having a roundabout way to state a key feature of an organism is not ideal. --Zhenqinli (talk) 21:46, 1 September 2024 (UTC)[reply]
P.S. The description of has characteristic (P1552) does mention: "Use a more specific property when possible". This property is currently used in more than 200,000 statements, without constraints on subject (organism or taxon) or value (mode of reproduction) as this proposal would prefer. These facts will likely discourage systematic input of useful data and eventual WDQS query of mode of reproduction information using this property in Wikidata. --Zhenqinli (talk) 02:25, 2 September 2024 (UTC)[reply]
  •   Support; Zhenqinli makes a strong case against using has characteristic (P1552). However, the proposal should be revised to reflect Andy's note – it's standard practice to apply statements only at the highest class (or taxon) at which they are universally true (and sometimes even higher, with qualification like nature of statement (P5102)=often (Q28962312)), a principle that Example 1 (at least) violates. [Edit: fixed 18:17, 12 September 2024 (UTC)] It doesn't seem like this property carries any special encouragement to violate that principle, but if it does, that could be addressed in a property usage note. Swpb (talk) 17:56, 9 September 2024 (UTC)[reply]
Agree that in the first example, Homo sapiens (Q15978631) should probably be replaced by mammal (Q7377). As parent taxon (P171) is a subproperty of subclass of (P279), statements describing organisms at higher taxon ranks do not need to be re-stated at lower ranks of the class, so there will be no redundancy issue. --Zhenqinli (talk) 18:49, 9 September 2024 (UTC)[reply]
  • I hope anyone who still has reservations about this proposal could help clarify whether there are remaining open issues or alternatives to be discussed further. While diel cycle (P9566) does have more than 284,000 statements for animals, I believe this proposed property for all living organisms should require far fewer statements, since mode of reproduction is typically better defined biologically and more commonly stated at higher taxon ranks than diel cycle (diel cycle could also be modified due to domestication). --Zhenqinli (talk) 18:09, 12 September 2024 (UTC)[reply]
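
The principle discussed above, stating the mode of reproduction once at a high taxon and letting lower taxa inherit it, can be illustrated with a small lookup that walks up the parent chain. The data here is a toy: the parent chain is heavily shortened, labels are used instead of QIDs, and the function name is invented for this example.

```python
# Toy stand-in for parent taxon (P171) links; real chains are much longer.
PARENT = {"Homo sapiens": "Homo", "Homo": "mammal", "mammal": "animal"}

# Toy stand-in for statements of the proposed property.
MODE_OF_REPRODUCTION = {"mammal": ["sexual reproduction"]}


def inherited_mode(taxon):
    """Return (taxon where the statement was found, values), walking upwards.
    Returns (None, []) when no ancestor carries a statement."""
    seen = set()
    while taxon is not None and taxon not in seen:
        if taxon in MODE_OF_REPRODUCTION:
            return taxon, MODE_OF_REPRODUCTION[taxon]
        seen.add(taxon)  # guard against accidental cycles in the data
        taxon = PARENT.get(taxon)
    return None, []


print(inherited_mode("Homo sapiens"))
print(inherited_mode("animal"))
```

A query-side equivalent would follow the parent taxon path in SPARQL; the point is only that a statement on mammal need not be repeated on each species.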

Biochemistry/molecular biology

Please visit Wikidata:WikiProject Molecular biology for more information. To notify participants use {{Ping project|Molecular biology}}

Chemistry

Please visit Wikidata:WikiProject Chemistry for more information. To notify participants use {{Ping project|Chemistry}}

chemical formula

  Notified participants of WikiProject Chemistry

Motivation

This proposal addresses the need for improved data structure and maintenance within Wikidata’s chemical compound data. Currently, the Wikidata:WikiProject Chemistry manages approximately 1 million chemical items, with many of them linked to chemical formula (P274) and mass (P2067). The main issues are:

Redundancy in Data: With about 300,000 unique chemical formula strings in use, redundancy is a significant problem. Some strings are associated with over 1,000 items, which complicates data management (see https://w.wiki/B2ax).

Efficiency and Maintenance: Transitioning from string-based formulas to item-based ones will simplify maintenance, reduce redundancy, and optimize query performance, especially for SPARQL queries involving formulas or masses.

Data Optimization: Moving mass (P2067) statements to the newly created formula items will reduce the number of triples and make data management more efficient. Additionally, this change will facilitate the use of different units for masses and allow for better structured data.

Improved Modeling: Switching to item-based formulas could eliminate the need for overly complex has part(s) (P527) statements on chemicals, allowing cleaner, more precise data models (e.g., identifying all chemical formulas containing more than five oxygen atoms).
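
As an illustration of the kind of question mentioned under "Improved Modeling", the sketch below parses flat formula strings into element counts and keeps those with more than five oxygen atoms. It is a simplified stand-in for what item-based formulas with has part(s) (P527) and quantity statements would allow in a query. It handles only plain formulas with ASCII digits (no parentheses, hydrates, charges or isotopes), and the sample list, including C6H12O6, is chosen purely for demonstration.

```python
import re
from collections import Counter

# An element symbol (capital letter, optional lowercase letter) and an optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Count atoms per element in a flat formula such as 'C15H20O4'."""
    counts = Counter()
    for element, number in TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


formulas = ["C15H20O4", "C30H40F2N8O9", "CH4N2O", "C6H12O6"]
oxygen_rich = [f for f in formulas if parse_formula(f)["O"] > 5]
print(oxygen_rich)
```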

This change is expected to bring numerous benefits, including reduced redundancy, improved query efficiency, and better data maintenance. The potential downside of increased label editing can be managed, and the overall gain for Wikidata’s chemical data justifies this proposal. If approved, I am prepared to create the necessary items and migrate existing data.

Any further input to refine this proposal is more than welcome!

P.S.: I have no strong opinions if current chemical formula (P274) should be deleted or used on the new items as "Chemical Formula String"

Discussion

  •   Support sounds great! Egon Willighagen (talk) 15:25, 28 August 2024 (UTC)[reply]
      Comment Last night on the boat between Finland and Sweden I thought of another aspect where this would help model the chemistry in Wikidata better. If chemical formulas are items (and thanks to GZWDer for showing that various Wikipedias decided it was useful too), then they can also subclass each other. We can have an isotope-agnostic chemical formula (the common case) and subclasses for chemical formulas with isotopes. As such it does much more than being something technical (e.g. just about scalability): it actually improves how we talk about the chemistry. Egon Willighagen (talk) 07:07, 29 August 2024 (UTC)[reply]
  • Some comments:
  1. I will oppose "Additionally, this change will facilitate the use of different units for masses and allow for better structured data." - For consistency and machine-readability we should stick to one unit. I instead propose Wikidata:Property proposal/formula weight.
  2. Many wikis have pages like C15H20O4 (Q1250089). Some wikis treat them as disambiguation pages, some as set indices; we need to discuss how to handle such existing items. GZWDer (talk) 21:10, 28 August 2024 (UTC)[reply]
  • I looked at the English Wikipedia sitelinked page, and that actually looks exactly like a page about a chemical formula. To be honest, this sounds like an argument in favor of this proposal, and for C15H20O4 (Q1250089) being of type chemical formula (Q83147). The same goes for the French WP page; neither says it is a disambiguation page, and both are far more like a category of things with the same property. Just like this proposal, no? Egon Willighagen (talk) 06:58, 29 August 2024 (UTC)[reply]
I was only partially able to follow your reasoning here. In your proposal, you refer to this property as if it will be created, so would you support it? I believe the discussion about mass (P2067) (and units) or other properties is an interesting one that this proposal would make easier to hold and act on. What I mentioned about these, and what is currently on the example item, are just ideas; if this new property allows these things to improve as well, even better! AdrianoRutz (talk) 08:51, 30 August 2024 (UTC)[reply]
  •   Weak oppose I cannot question arguments raised here about efficiency, but I don't see this as a proper way forward. This proposal completely fails to take into account the fact that for a given chemical entity there may be many – equally correct – chemical formulae (simple example in Q27260276#P274). Moving chemical formulae to another item will not help at all with the most important purpose for which WD exists – using this data. I would see the new property as being created only to assist with specific activities – but not to replace existing properties – and with appropriate disclaimers in the name and constraints that it is a strictly technical property only. Wostr (talk) 22:21, 28 August 2024 (UTC)[reply]
    I think this proposal has no problems with alternative formula notations, e.g. like CHAgO₃ (Q130044611). Or? Egon Willighagen (talk) 06:51, 29 August 2024 (UTC)[reply]
    CHAgO₃ and AgHCO₃ are not the same chemical formula. The same goes for e.g. XeF4O and XeOF4, which would require two different items for the same compound. In fact, for some compounds several new items would need to be created. For some chemical species we would have formulae with different numbers of atoms of each element: C30H40F2N8O9, C15H17FN4O3·1.5H2O and C30H34F2N8O6·3H2O are all correct formulae for the same compound, but I don't see a way for this to be reflected correctly by the current proposal. Everything looks fine if you consider only simple organic compounds and their formulae in Hill notation, but it's not that simple, especially if we consider some inorganic compounds which are not molecules. Wostr (talk) 12:34, 29 August 2024 (UTC)[reply]
    Thank you for this important point! I removed the single value constraint, thus allowing for what you mention. AdrianoRutz (talk) 08:47, 30 August 2024 (UTC)[reply]
    Good point about non-molecular substances. I think the chemical concept we are trying to capture is that of isomerism: chemical entities are isomers when they have the same molecular formula (Q188009) or (non-structural) formula unit (Q1437643), enabling one molecule/ion/unit of the first chemical entity to be rearranged into one molecule/ion/unit of the second chemical entity by moving atoms/bonds around.
    • For example, the ionic compounds with structural formulas [CrCl(H₂O)₅]Cl₂•H₂O and [Cr(H₂O)₆]Cl₃ are (hydration) isomers, which we can recognise by assigning them the same formula H₁₂Cl₃CrO₆. This shows that all species in the crystal lattice of a compound should be combined together into a single entity when determining the formula. In the example you give above, the correct formula would be C₃₀H₄₀F₂N₈O₉, derived from combining together 2C₁₅H₁₇FN₄O₃•3H₂O, the smallest formula unit with integer multiples of all species.
    • Likewise, the molecular substance CO(NH₂)₂ and ionic compound NH₄OCN are considered isomers, which we can recognise by assigning them the same formula CH₄N₂O. This is the molecular formula of urea and the formula unit of ammonium cyanate, showing how molecular and non-molecular substances can be isomeric.
    • For ions, fulminate(1−) (Q27110286) (with structural formula CNO-) and cyanate anion (Q55503523) (with structural formula OCN-) are isomers, which we can recognise by assigning them the same formula CNO-.
    • Clathrates are similar to coordination compounds. E.g. methane clathrate (Q389036) has structural formula 4CH₄•23H₂O, yielding the formula C₄H₆₂O₂₃. Likewise, the endohedral fullerene CH₄@C₆₀ should have formula C₆₁H₄.
    • Compounds should not usually map to multiple formulas: if C links to two different formulas, one the same as A (from reference 1) and one the same as B (from reference 2), this implies C is isomeric with A, and C is isomeric with B, but A is not isomeric with B. This only makes sense if 1 and 2 disagree as to what the correct formula of C ought to be.
    • When references disagree, we may need to support multiple formulas. Historically, w:en:copper monosulfide was thought to have structure [Cu2+][S2-], corresponding to the formula CuS. It has now been assigned the structure [Cu+]₃[S2-][S₂-], which would correspond to Cu₃S₃. However, PubChem still has the old formula. We might want to update Wikidata to the new formula while also keeping the PubChem-referenced formula (with a note that it's not the correct formula).
    • Non-stoichiometric compounds, alloys, and mixtures of indeterminate composition are more complicated to support. E.g. pyrrhotite (Q421944) has formula Fe1-xS (x = 0 to 0.125). Rather than trying to support formula units with atom counts that are algebraic expressions (e.g. 1 - x), I think it would be easier if we could list the formulas of the endpoints: Fe₇S₈ and FeS. Similarly, superconducting yttrium barium copper oxide (Q414015) has formula YBa2Cu3O7−x (x = 0 to 0.65), with endpoint formulas YBa2Cu3O6.35 (i.e. Y20Ba40Cu60O127) and YBa2Cu3O7. I think it's hard to come up with a perfect solution though. InChI (P234) has similar issues for non-stoichiometric compounds: https://doi.org/10.1186/s13321-015-0068-4#Sec45.
    Preimage (talk) 17:47, 31 August 2024 (UTC)[reply]
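
The rule described in the bullets above, combining all species of a formula unit into one set of atom counts, can be checked mechanically. The sketch below sums element counts over (multiplier, species) pairs and writes the result in Hill order, reproducing two of the worked examples from this comment. Function and variable names are invented for illustration, and ASCII digits are used instead of subscripts.

```python
from collections import Counter


def combine(units):
    """Sum element counts over (multiplier, counts) pairs."""
    total = Counter()
    for multiplier, counts in units:
        for element, n in counts.items():
            total[element] += multiplier * n
    return total


def hill(counts):
    """Hill order: C, then H, then the rest alphabetically when carbon is
    present; strictly alphabetical when it is not."""
    if "C" in counts:
        rest = sorted(e for e in counts if e not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        order = sorted(counts)
    return "".join(e + (str(counts[e]) if counts[e] != 1 else "") for e in order)


water = {"H": 2, "O": 1}
anhydrous_unit = {"C": 15, "H": 17, "F": 1, "N": 4, "O": 3}  # C15H17FN4O3
methane = {"C": 1, "H": 4}

# 2 C15H17FN4O3 + 3 H2O, and 4 CH4 + 23 H2O (methane clathrate)
sesquihydrate = hill(combine([(2, anhydrous_unit), (3, water)]))
clathrate = hill(combine([(4, methane), (23, water)]))
print(sesquihydrate, clathrate)
```

This prints C30H40F2N8O9 and C4H62O23, matching the formulas given above. Non-stoichiometric compounds are outside what this sketch covers.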
  •   Support I also see more benefits than downsides. Wostr, I am not sure I understand how this would be a problem, even for entities which could be described using different MF sequences of atoms, like Q27260276#P274. Indeed, the has part(s) (P527) and quantity (P1114) statements of the MF entity (see C₁₅H₂₀O₄ (Q129998552)) would allow such compounds, represented in different MF notation systems, to be retrieved efficiently. What exactly would be the inconvenience in this particular case? GrndStt (talk) 06:22, 29 August 2024 (UTC)[reply]
  •   Support, conditional on change of representation to molecular formula (Q188009). As noted in w:en:chemical formula#Types, chemical formula (Q83147) has four separate meanings: empirical formula (e.g. formaldehyde and glucose both have empirical formula CH₂O), molecular formula (e.g. urea and ammonium cyanate both have molecular formula CH₄N₂O in Hill notation, indicating they are isomers), structural formula (a graphical representation of the structure, not so relevant here), and condensed (or semi-structural) formula (e.g. urea has condensed formula CO(NH₂)₂ whereas ammonium cyanate has condensed formula [NH₄][OCN]). Molecular formulas "indicate the simple numbers of each type of atom in a molecule, with no information on structure", which is what we need for mass calculations. They also avoid the issue raised by Wostr regarding non-uniqueness of chemical formulas (e.g. NH₄NO₃ and H₄N₂O₃ are both valid formulas for ammonium nitrate), as each chemical should have a single canonical molecular formula in Hill notation (with the exception of rare cases where there is disagreement regarding structure, e.g. w:en:copper monosulfide). One last potential issue: molecular formulas are often defined as not including isotopes, e.g. PubChem lists both deuterated chloroform and chloroform as having molecular formula CHCl₃. Egon Willighagen's suggestion to have a subclass of [molecular] formulas with isotopic information would resolve this issue though, I think. Preimage (talk) 12:22, 29 August 2024 (UTC)[reply]

formula weight

Status: Under discussion
Description: molar mass of an empirical formula unit of a chemical compound, element or isotope
Represents: formula weight (Q3900742)
Data type: Number (not available yet)
Example 1: C₁₅H₂₀O₄ (Q129998552) => 264.13615911
Example 2: sodium chloride (Q2314) => 57.959 (may be unnecessary once we have Wikidata:Property proposal/chemical formula)
Example 3: Phosphorus pentoxide (Q369309) => 283.9 (P2O5) and 567.8 (P4O10)
Example 4: chlorine (Q688) => 35.45±0.01
Example 5: uranium-238 (Q1148503) => 238.050786996±0.000001602

Motivation

Currently we use mass (P2067) for molar mass of a chemical compound (for individual items about compound). This is problematic for the following reasons:

So I propose to split "formula weight" of a compound to a new property that does not use unit.--GZWDer (talk) 21:07, 28 August 2024 (UTC)[reply]

Discussion

  Weak oppose (1) formula weight is an outdated term, AFAIK relative molar mass is the correct one. (2) molar mass unit was rightly removed from mass (P2067) some time ago (now I saw it has been restored, which I just fixed, because different units should not be mixed in one property), (3) dalton (Q483261) is not a unit of molar mass and we don't have molar masses in our items right now, (4) whether we will have a relative molar mass without a unit or a molar mass with a unit – it has no practical significance as long as it is consistently observed within the property, (5) the most important problem, which has not been mentioned in any aspect, and which at this point prevents any use in any way of the masses of chemical entities that we already have – for each chemical entity, the mass can be calculated in many ways: as the mass taking into account the natural abundance, or the monoisotopic mass. The lack of appropriate qualifiers defining the method of calculation of the mass prevents the use of such mass and will prevent such use whether we create one new property or 10. Like the proposal regarding chemical formulae, this proposal is published too quickly and there was no prior discussion of the problem, e.g. in the WikiProject Chemistry. In my opinion, the discussion here will not enable us to develop an appropriate solution, and the proposal itself in its current form does not solve the most important problem that we currently have. Wostr (talk) 22:39, 28 August 2024 (UTC) Pinging user:Preimage as you participated in this discussion. Wostr (talk) 22:42, 28 August 2024 (UTC)[reply]

Linking also Wikidata_talk:WikiProject_Chemistry/Archive/2018#(Molar?)_mass_of_compounds. Wostr (talk) 22:42, 28 August 2024 (UTC)[reply]
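
Point (5) above, that a mass can be computed either from natural-abundance atomic weights or as a monoisotopic mass, is easy to see numerically. The sketch below computes both for C15H20O4. The atomic values are rounded approximations entered by hand for illustration (standard atomic weights, and the masses of carbon-12, hydrogen-1 and oxygen-16), not values taken from Wikidata.

```python
# Approximate, rounded values for illustration only.
STANDARD_ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.007825, "O": 15.994915}  # 12C, 1H, 16O


def formula_mass(counts, table):
    """Sum of (mass per atom x number of atoms) over all elements."""
    return sum(table[element] * n for element, n in counts.items())


c15h20o4 = {"C": 15, "H": 20, "O": 4}
average = formula_mass(c15h20o4, STANDARD_ATOMIC_WEIGHT)
monoisotopic = formula_mass(c15h20o4, MONOISOTOPIC_MASS)
print(f"average: {average:.3f}  monoisotopic: {monoisotopic:.4f}")
```

With these inputs the monoisotopic figure comes out near 264.136, in line with the value in Example 1, while the abundance-weighted figure is about 264.32. A statement without a qualifier saying which method was used cannot be told apart from the other.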

Medicine

Please visit Wikidata:WikiProject Medicine for more information. To notify participants use {{Ping project|Medicine}}

Mineralogy

Please visit Wikidata:WikiProject Mineralogy for more information. To notify participants use {{Ping project|Mineralogy}}

Computer science

Please visit Wikidata:WikiProject Informatics for more information. To notify participants use {{Ping project|Informatics}}

Geology

Please visit Wikidata:WikiProject Geology for more information.

Geography

Linguistics

Please visit Wikidata:WikiProject Linguistics for more information. To notify participants use {{Ping project|Linguistics}}

Mathematics

Please visit Wikidata:WikiProject Mathematics for more information. To notify participants use {{Ping project|Mathematics}}

LMFDB knowl ID

Description: identifier for a knowl in the L-functions and modular forms database
Represents: LMFDB (Q130218488)
Data type: External identifier
Allowed values: [a-z0-9_.-]+
Example 1: Langlands program (Q1393253) → lfunction.history.langlands_program
Example 2: gamma function (Q190573) → specialfunction.gamma
Example 3: group ring (Q2602722) → group.group_ring
Example 4: Siegel modular form (Q7510567) → mf.siegel
Source: https://www.lmfdb.org/knowledge/
Mix'n'match: 6447
Number of IDs in source: 1246
Formatter URL: https://www.lmfdb.org/knowledge/show/$1
URL match pattern: ^https:\/\/www.lmfdb.org\/knowledge\/show\/([a-z0-9_.-]+)
Applicable "stated in"-value: LMFDB (Q130218488)
Single-value constraint: yes
Distinct-values constraint: yes
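
The formatter URL and the URL match pattern above are meant to be inverses of each other. A quick way to sanity-check that is a round trip: extract the identifier from a URL with the match pattern, then rebuild the URL with the formatter. The sketch below does this for one of the listed examples; the function name is invented, and both strings are copied from the proposal as written.

```python
import re

FORMATTER_URL = "https://www.lmfdb.org/knowledge/show/$1"
URL_MATCH = re.compile(r"^https:\/\/www.lmfdb.org\/knowledge\/show\/([a-z0-9_.-]+)")


def extract_id(url: str):
    """Return the knowl ID captured by the match pattern, or None."""
    match = URL_MATCH.match(url)
    return match.group(1) if match else None


original = "https://www.lmfdb.org/knowledge/show/mf.siegel"
knowl_id = extract_id(original)
rebuilt = FORMATTER_URL.replace("$1", knowl_id)
print(knowl_id, rebuilt == original)
```

Note that the dots in "www.lmfdb.org" are unescaped in the pattern as written, so they match any character; escaping them would make the pattern slightly stricter.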

Motivation

LMFDB has a closed wiki to complement its database of L-functions, algebraic varieties, and groups. The wiki URLs haven't changed since the launch in 2015 and there's still some activity. Dexxor (talk) 10:57, 2 September 2024 (UTC)[reply]

Discussion

  Notified participants of WikiProject Mathematics

Material

Please visit Wikidata:WikiProject Materials for more information. To notify participants use {{Ping project|Materials}}

Meteorology

Glaciology

Nutrition
